Arthur T. Benjamin, Ph.D.

Professor of Mathematics, Harvey Mudd College 

Arthur T. Benjamin is a Professor of Mathematics at Harvey Mudd College. He 
graduated from Carnegie Mellon University in 1983, where he earned a B.S. in 
Applied Mathematics with university honors. He received his Ph.D. in 
Mathematical Sciences in 1989 from Johns Hopkins University, where he was 
supported by a National Science Foundation graduate fellowship and a Rufus P. 
Isaacs fellowship. Since 1989, Dr. Benjamin has been a faculty member of the 
Mathematics Department at Harvey Mudd College, where he has served as 
department chair. He has spent sabbaticals at Caltech, Brandeis University,
and the University of New South Wales in Sydney, Australia.

In 1999, Professor Benjamin received the Southern California Section of the 
Mathematical Association of America (MAA) Award for Distinguished College 
or University Teaching of Mathematics, and in 2000, he received the MAA 
Deborah and Franklin Tepper Haimo National Award for Distinguished College 
or University Teaching of Mathematics. He was named the 2006-2008 George 
Polya Lecturer by the MAA. 

Dr. Benjamin’s research interests include combinatorics, game theory, and 
number theory, with a special fondness for Fibonacci numbers. Many of these 
ideas appear in his book (co-authored with Jennifer Quinn), Proofs That Really
Count: The Art of Combinatorial Proof, published by the MAA. In 2006, that
book received the Beckenbach Book Prize from the MAA. Professors Benjamin
and Quinn are the co-editors of Math Horizons magazine, published by the MAA
and enjoyed by more than 20,000 readers, mostly undergraduate math students
and their teachers.

Professor Benjamin is also a professional magician. He has given more than 
1,000 “mathemagics” shows to audiences all over the world (from primary 
schools to scientific conferences), where he demonstrates and explains his 
calculating talents. His techniques are explained in his book Secrets of Mental
Math: The Mathemagician’s Guide to Lightning Calculation and Amazing Math
Tricks. Prolific math and science writer Martin Gardner calls it “the clearest,
simplest, most entertaining, and best book yet on the art of calculating in your
head.” An avid games player, Dr. Benjamin was the winner of the American
Backgammon Tour in 1997.

Professor Benjamin has appeared on dozens of television and radio programs, 
including the Today Show, CNN, and National Public Radio. He has been 
featured in Scientific American, Omni, Discover, People, Esquire, The New York 
Times, The Los Angeles Times, and Reader’s Digest. In 2005, Reader’s Digest 
called him “America’s Best Math Whiz.” 
The Joy of Mathematics


Scope: 

For most people, mathematics is little more than counting: basic arithmetic and 
bookkeeping. People might recognize that numbers are important, but most 
cannot fathom how anyone could find mathematics to be a subject that can be 
described by such adjectives as joyful, beautiful, creative, inspiring, or fun. This 
course aims to show how mathematics — from the simplest notions of numbers 
and counting to the more complex ideas of calculus, imaginary numbers, and 
infinity — is indeed a great source of joy. 

Throughout most of our education, mathematics is used as an exercise in 
disciplined thinking. If you follow certain procedures carefully, you will arrive 
at the right answer. Although this approach has its value, I think that not enough 
attention is given to teaching math as an opportunity to explore creative 
thinking. Indeed, it’s marvelous to see how often we can take a problem, even a 
simple arithmetic problem, solve it lots of different ways, and always arrive at 
the same answer. This internal consistency of mathematics is beautiful. When 
numbers are organized in other ways, such as in Pascal’s triangle or the 
Fibonacci sequence, then even more beautiful patterns emerge, most of which 
can be appreciated from many different perspectives. Learning that there is more 
than one way to solve a problem or understand a pattern is a valuable life lesson 
in itself. 

Another special quality of mathematics, one that separates it from other 
academic disciplines, is its ability to achieve absolute certainty. Once the 
definitions and rules of the game (the rules of logic) are established, you can 
reach indisputable conclusions. For example, mathematics can prove, beyond a 
shadow of a doubt, that there are infinitely many prime numbers and that the 
Pythagorean theorem (concerning the lengths of the sides of a right triangle) is 
absolutely true, now and forever. It can also “prove the impossible,” from easy 
statements, such as “The sum of two even numbers is never an odd number,” to 
harder ones, such as “The digits of pi (π) will never repeat.” Scientific theories
are constantly being refined and improved and, occasionally, tossed aside in 
light of better evidence. But a mathematical theorem is true forever. We still 
marvel over the brilliant logical arguments put forward by the ancient Greek 
mathematicians more than 2,000 years ago. 

From backgammon and bridge to chess and poker, many popular games utilize 
math in some way. By understanding math, especially probability and 
combinatorics (the mathematics of counting), you can become a better game 
player and win more. 

Of course, there is more to love about math besides using it to win games, or 
solve problems, or prove something to be true. Within the universe of numbers, 
there are intriguing patterns and mysteries waiting to be explored. This course 
will reveal some of these patterns to you. 


In choosing material for this course, I wanted to make sure to cover the 
highlights of the traditional high school mathematics curriculum of algebra, 
geometry, trigonometry, and calculus, but in a nontraditional way. I will 
introduce you to some of the great numbers of mathematics, including π, e, i, 9,
the numbers in Pascal’s triangle, and (my personal favorites) the Fibonacci 
numbers. Toward the end of the course, as we explore notions of infinity, 
infinite series, and calculus, the material becomes a little more challenging, but 
the rewards and surprises are even greater. 

Although we will get our hands dirty playing with numbers, manipulating 
algebraic expressions, and exploring many of the fundamental theorems in 
mathematics (including the fundamental theorems of arithmetic, algebra, and 
calculus), we will also have fun along the way, not only with the occasional 
song, dance, poem, and lots of bad jokes, but also with three lectures exploring 
applications to games and gambling. Aside from being a professor of 
mathematics, I have more than 30 years of experience as a professional magician,
and I try to infuse a little bit of magic in everything I teach. In fact, the last
lesson of the course (which you could watch first, if you want) is on the joy of
mathematical magic.


Mathematics is food for the brain. It helps you think precisely, decisively, and
creatively and helps you look at the world from multiple perspectives. [...]
it comes in handy when dealing with numbers directly, such as [...]
shopping around for the best bargain or trying to [...]
read in the newspaper. But I hope that you also [...]
a new way to experience beauty, in the [...]
logical argument. Many people find [...]
of art, and mathematics offers [...]
If Elizabeth Barrett Browning [...]
“How do I count thee? Let [...]
Lecture Thirteen
The Joy of Trigonometry 


Scope: We begin this lecture by using a right triangle to define sine, cosine, 
and tangent of any acute angle (an angle between 0° and 90°), along 
with their reciprocals, the cosecant, secant, and cotangent. We then use 
the unit circle to expand this definition to allow us to define these 
trigonometric functions for any angle. From the Pythagorean theorem, 
we prove that the square of the cosine plus the square of the sine is 
equal to 1 for any angle. We derive the formula for the cosine of the 
sum or difference of any two angles, leading to a host of other 
trigonometric delights. We end by examining the graphs of 
trigonometric functions and mention some of their applications. 

Outline 

I. Trigonometry comes from the Greek trigonometria (literally, the
measurement of triangles). It allows us to calculate measurements pertaining
to triangles that we could not easily find using standard geometry techniques.

A. All of trigonometry is based on two important functions known as the 
sine function and the cosine function. We will initially define these in 
terms of a right triangle. 

1. We begin with a right triangle with one angle labeled a. The side 
that is opposite a is called the opposite side. The other side 
adjacent to a that isn’t the hypotenuse is called the adjacent side. 

2. We define the sine of a (abbreviated as “sin a”) to be the length of 
the opposite side divided by the length of the hypotenuse: 

sine = opposite/hypotenuse. 

3. The cosine of a (abbreviated as “cos a”) is defined as the length of
the adjacent side divided by the length of the hypotenuse:

cosine = adjacent/hypotenuse.

4. The third most commonly used trigonometric function is the
tangent function, which is the sine divided by the cosine. Because
sine is opposite/hypotenuse and cosine is adjacent/hypotenuse, the
tangent of a (abbreviated as “tan a”) is their quotient:

tangent = opposite/adjacent. 

B. We can now calculate some trigonometric values. For instance, let’s 
look at a classic right triangle with side lengths 3, 4, and (hypotenuse 
length) 5. 

1. If the side opposite angle a has length 3, then sin a = 3/5,
cos a = 4/5, and tan a = 3/4.

2. Note that the complementary angle to a has a measure of 90° - a, an
angle whose sine is 4/5, cosine is 3/5, and tangent is 4/3.


3. It’s no coincidence that the sine of the second angle is the cosine of 
the first angle, and the cosine of the second angle is the sine of the 
first angle. Those values come straight from the definition: 
sin (90 - a) = cos a, cos (90 - a) = sin a. 
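
The 3-4-5 values above can be checked numerically. The following is a minimal Python sketch, not part of the original lecture; the variable names are illustrative, and it assumes the side opposite angle a has length 3, the reading consistent with sin a = 3/5.

```python
import math

# Side lengths of the classic right triangle from the outline.
# Assumption: the side opposite angle a is 3, which matches sin a = 3/5.
opposite, adjacent, hypotenuse = 3, 4, 5

sin_a = opposite / hypotenuse
cos_a = adjacent / hypotenuse
tan_a = opposite / adjacent

# Recover the angle in degrees, then build the complementary angle 90 - a.
a_deg = math.degrees(math.asin(sin_a))
complement = math.radians(90 - a_deg)

print(f"sin a = {sin_a}, cos a = {cos_a}, tan a = {tan_a}")
print(f"a is about {a_deg:.2f} degrees")
# The sine of the complement should match cos a, and vice versa.
print(f"sin(90 - a) = {math.sin(complement):.4f}")
print(f"cos(90 - a) = {math.cos(complement):.4f}")
```

The last two printed values should agree with cos a and sin a, respectively, up to floating-point rounding.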


C. You should also be aware of three other trigonometric functions:

function     reciprocal of function     can be written as
secant       cosine                     sec = 1/cos
cosecant     sine                       csc = 1/sin
cotangent    tangent                    cot = 1/tan


II. The definitions that we’ve looked at so far allow us to define the sine,
cosine, and tangent only for angles between 0° and 90° because that’s all we
can fit in a right triangle. A more general view of trigonometric functions
allows us to define these for any angle.

A. We begin with the unit circle, which has a radius of 1. The unit circle
has the equation $x^2 + y^2 = 1$.


B. We draw an angle of measure a on the unit circle. Let's label the point
that corresponds to angle a as (x, y). If we drop a line from (x, y) to the
x-axis, we create a right triangle. We know that the base of this triangle
is x, the height is y, and the hypotenuse is length 1; thus, cos a = x/1 = x
and sin a = y/1 = y. [...]

[...] (adding 360° to angle a) literally takes us full circle; the cosine
and sine will be exactly the same as they were: cos (a + 360) = cos a, and
sin (a + 360) = sin a.

[...] set of points (x, y) that satisfies $x^2 + y^2 = 1$. Because
(cos a, sin a) is on the unit circle, that means that
$(\cos a)^2 + (\sin a)^2 = 1$. This famous formula is usually written as:

$\cos^2 a + \sin^2 a = 1$, or simply as $\cos^2 + \sin^2 = 1$.
III. The box below shows some other angles for our trigonometric vocabulary.



A. Notice that we don’t have to memorize the tangents because they are 
simply the sine values divided by the cosine values. 

B. Note also that the arc tangent of 1 (the angle whose tangent is 1) is 45°.
The arc sine of 1/2 (the angle whose sine is 1/2) is 30°. 

IV. Now we’re ready to look at some problems. 

A. We see a right triangle with an angle of 30° whose opposite side has
length 10. What are the lengths of the other two sides of this triangle?

1. Let b be the length of the hypotenuse. Since sin 30 = 1/2 and sin 30
(opposite/hypotenuse) = 10/b, then 10/b = 1/2. Thus, b = 20. The
length of the hypotenuse is 20.

2. To find the length of the other side a, we’ll use the Pythagorean
theorem. We know that b = 20, and because we’re dealing with a
right triangle, we also know that $10^2 + a^2 = b^2$. We just saw that $b^2$
is $20^2$, or 400, which tells us that $a^2$ is 300; therefore, $a = \sqrt{300}$, or
$10\sqrt{3}$, or approximately 17.3.
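
The two steps of this problem can be reproduced in a few lines. This is a minimal Python sketch under the assumption, taken from the working above, that the side opposite the 30° angle has length 10; the variable names are illustrative only.

```python
import math

# Assumption from the worked problem: the side opposite the 30-degree
# angle has length 10.
opposite = 10

# Step 1: sin 30 = opposite / hypotenuse, so hypotenuse = opposite / sin 30.
b = opposite / math.sin(math.radians(30))

# Step 2: the Pythagorean theorem gives the remaining leg a.
a = math.sqrt(b**2 - opposite**2)

print(f"hypotenuse b = {b:.4f}")
print(f"other side a = {a:.4f}  (compare 10 * sqrt(3) = {10 * math.sqrt(3):.4f})")
```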

B. We have a base of length 26, a side of length 21, and an angle of 15°
between them. Can we find the area of this triangle?

1. First, we’ll draw a new line, splitting the triangle into two right
triangles. The opposite side here has height h and the hypotenuse has
length 21; thus, sin 15 = h/21. Hence, h = 21 sin 15, and from our
calculator sin 15 ≈ 0.2588, so h is approximately 5.435.

2. Knowing the height and the length of the base (given as 26), we
can find the area of the triangle: (1/2)bh, or (1/2)(26)(5.435) ≈ 70.66.
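
The same area calculation can be sketched in Python. This is only an illustration of the steps above; the names `base`, `side`, and `angle_deg` are assumptions introduced for the example.

```python
import math

base, side, angle_deg = 26, 21, 15

# The height is the side opposite the 15-degree angle in the small right
# triangle whose hypotenuse is the side of length 21.
h = side * math.sin(math.radians(angle_deg))

# Area of a triangle: one half of base times height.
area = 0.5 * base * h

print(f"h is about {h:.3f}")
print(f"area is about {area:.2f}")
```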

V. We’ll now prove one of the most difficult identities in basic trigonometry 
using a tool from our geometry lecture. (Keep in mind that you may have to 
go through this proof more than once before it sinks in.) This identity is as 
follows: cos (a - b) = cos a cos b + sin a sin b.

A. Here’s the tool from geometry: For a line of length L that goes from
point $(x_1, y_1)$ to point $(x_2, y_2)$, by the Pythagorean theorem, we showed
that $L^2$ is equal to $(x_1 - x_2)^2 + (y_1 - y_2)^2$.

B. We start our proof by looking at the unit circle. Focus on the triangle
whose vertices are the origin, the point (0, 0); the point (cos a, sin a);
and the point (cos b, sin b). We know that two of the side lengths of
that triangle are 1 because they are radii of the unit circle. We want to
calculate the length of the line L that connects (cos a, sin a) to
(cos b, sin b).

C. From the $L^2$ formula, we see $L^2 = (\cos a - \cos b)^2 + (\sin a - \sin b)^2$.
We next expand that equation. The first term expands to
$\cos^2 a + \cos^2 b - 2\cos a\cos b$. The second term expands to
$\sin^2 a + \sin^2 b - 2\sin a\sin b$. Simplifying, $\cos^2 a + \sin^2 a = 1$, and
$\cos^2 b + \sin^2 b = 1$. The expression now reads:
$L^2 = 2 - 2\cos a\cos b - 2\sin a\sin b$.

D. We now rotate the triangle so that the lower side is lying on the x-axis.
Note that the lengths of the sides are still 1, and the length L hasn’t
changed either. The angle that we’re looking at is angle a minus angle
b, or a - b. What is the length of L?

1. Look at the change of the x-coordinates and the change of
the y-coordinates. Because the side of the triangle is lying on
the x-axis and has a length of 1, that lower point is (1, 0);
because the upper point of the triangle corresponds to angle
a - b, it has coordinates (cos (a - b), sin (a - b)).

2. According to the $L^2$ formula, we add the change in x-coordinates
squared and the change in y-coordinates squared:

$(\cos(a - b) - 1)^2 + (\sin(a - b) - 0)^2$.

3. When we expand that, we get:
$\cos^2(a - b) + 1 - 2\cos(a - b) + \sin^2(a - b)$.
This equation is not as messy as it looks because
$\cos^2 + \sin^2 = 1$. Thus, we have: $2 - 2\cos(a - b)$.

E. Now, we have to equate the two expressions that we found for $L^2$:

$2 - 2\cos(a - b) = 2 - 2\cos a\cos b - 2\sin a\sin b$. We subtract 2 from
both sides and divide everything by -2 to get the desired formula:
cos(a - b) = cos a cos b + sin a sin b.
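
A numerical spot check can make the proof feel less abstract. The sketch below, written in Python with hypothetical helper names (`squared_distance`, `check`), computes the squared length both ways, before and after the rotation, and compares the two sides of the identity. It is a sanity check for a few sample angles, not a substitute for the proof.

```python
import math

def squared_distance(p, q):
    """Squared length of the segment from p to q (the L^2 formula)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def check(a_deg, b_deg):
    """Return L^2 computed two ways, plus both sides of the identity."""
    a, b = math.radians(a_deg), math.radians(b_deg)
    # Original triangle: points at angles a and b on the unit circle.
    direct = squared_distance((math.cos(a), math.sin(a)),
                              (math.cos(b), math.sin(b)))
    # Rotated triangle: points at angle a - b and at (1, 0).
    rotated = squared_distance((math.cos(a - b), math.sin(a - b)), (1.0, 0.0))
    lhs = math.cos(a - b)
    rhs = math.cos(a) * math.cos(b) + math.sin(a) * math.sin(b)
    return direct, rotated, lhs, rhs

for a_deg, b_deg in [(75, 30), (200, 45), (10, 350)]:
    direct, rotated, lhs, rhs = check(a_deg, b_deg)
    print(a_deg, b_deg, round(direct, 6), round(rotated, 6),
          round(lhs, 6), round(rhs, 6))
```

In each printed row, the two squared lengths should agree, and so should the last two columns.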

VI. Once we have that equation, we can prove many useful identities. (Any
truth in trigonometry is typically called a trigonometric identity.)

A. For instance, look what happens when we set a = 90°: cos(90 - b) =
cos 90 cos b + sin 90 sin b. But if you remember that cos 90 = 0 and
sin 90 = 1, that equation simplifies to: cos(90 - b) = sin b. We can also
calculate sin(90 - a), which is cos(90 - (90 - a)) = cos a. This shows
that those formulas are true for any angle, not just for angles between
0 and 90 degrees.

B. We have a formula for cos(a - b), but what about cos(a + b)? We
simply replace b with -b, so that the formula reads cos(a - (-b)) =
cos(a) cos(-b) + sin(a) sin(-b). But cos(-b) is the same as cos b, and
sin(-b) is the negative of sin b. When we plug those in, we get the
equation: cos(a + b) = cos a cos b - sin a sin b.

C. When a and b are the same angle, we have the double-angle formula:
$\cos(2a) = \cos^2 a - \sin^2 a$.

D. We can do similar calculations with the sine function and show
that sin(a + b) = sin a cos b + cos a sin b. In particular, when a and
b are equal, this formula says that sin(2a) = 2 sin a cos a.
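
As an illustration of how these identities produce new exact values, the following Python sketch builds the sine and cosine of 15° from the 30° and 45° values. One step is an assumption not stated in the outline: the formula sin(a - b) = sin a cos b - cos a sin b, which follows from the sine sum formula by replacing b with -b.

```python
import math

s2, s3 = math.sqrt(2), math.sqrt(3)

# Standard values for 30 and 45 degrees.
sin30, cos30 = 1 / 2, s3 / 2
sin45, cos45 = s2 / 2, s2 / 2

# cos(a - b) = cos a cos b + sin a sin b with a = 45 and b = 30.
cos15 = cos45 * cos30 + sin45 * sin30      # equals (sqrt(6) + sqrt(2)) / 4

# Assumed companion formula: sin(a - b) = sin a cos b - cos a sin b.
sin15 = sin45 * cos30 - cos45 * sin30      # equals (sqrt(6) - sqrt(2)) / 4

# The double-angle formulas at a = 15 should return the 30-degree values.
sin30_check = 2 * sin15 * cos15
cos30_check = cos15**2 - sin15**2

print(f"cos 15 = {cos15:.4f}, sin 15 = {sin15:.4f}")
print(f"2 sin15 cos15 = {sin30_check:.4f}, cos^2 15 - sin^2 15 = {cos30_check:.4f}")
```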
VII. Instead of using degrees that go from 0 to 360, mathematicians use a
measurement called radians, in which 360° = 2π radians. Hence 1 radian is
360/(2π) degrees, approximately 57°.
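
The conversion is easy to confirm. This short Python sketch divides 360 by 2π directly and compares the result with the standard-library helper `math.degrees`; the variable name is illustrative.

```python
import math

# 360 degrees = 2*pi radians, so one radian is 360 / (2*pi) degrees.
one_radian_in_degrees = 360 / (2 * math.pi)

print(f"1 radian is about {one_radian_in_degrees:.2f} degrees")
# The standard library performs the same conversion.
print(math.isclose(one_radian_in_degrees, math.degrees(1)))
```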

VIII. Because the graphs of trigonometric functions come from the unit circle, 
they have a nice periodic property. The sine and cosine functions can be 
combined to model almost any function that goes up and down in a periodic 
way, such as seasons, sound waves, and heartbeats. 


IX. We’ll close with the law of sines and the law of cosines. For any triangle, 
with angles A, B, C, and corresponding side lengths a, b, c: 


law of sines      (sin A)/a = (sin B)/b = (sin C)/c
law of cosines    $c^2 = a^2 + b^2 - 2ab\cos C$

A. The law of cosines can be thought of as a generalization of the
Pythagorean theorem.

B. With the law of cosines, we can find the length of a missing side, c, in
a given triangle. In our previous example, the remaining side had length
c, which satisfies $c^2 = 26^2 + 21^2 - 2(26)(21)\cos 15$. Since cos 15 ≈
0.9659, we get $c^2 \approx 62.2$, or c is approximately 7.89.
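
The law-of-cosines example can be reproduced as follows. This is a minimal Python sketch with illustrative names; it also shows that a 90° angle reduces the formula to the Pythagorean theorem, using the 3-4-5 triangle as a test case.

```python
import math

def third_side(a, b, angle_c_deg):
    """Length of the side opposite angle C, by the law of cosines."""
    c_squared = a**2 + b**2 - 2 * a * b * math.cos(math.radians(angle_c_deg))
    return math.sqrt(c_squared)

# The triangle from the area example: sides 26 and 21 with 15 degrees between.
c = third_side(26, 21, 15)
print(f"c^2 is about {c**2:.1f}, so c is about {c:.2f}")

# With a right angle, cos 90 = 0 and the formula becomes the Pythagorean theorem.
print(f"right-angle check: {third_side(3, 4, 90):.4f}")
```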

Reading: 

I. M. Gelfand and M. Saul, Trigonometry. 

Eli Maor, Trigonometric Delights. 

Questions to Consider: 

1. Although it is useful to memorize the values of sine and cosine for 0°, 30°, 
45°, 60°, and 90°, they can be easily derived from basic geometry. Try to do 
so. Once you know these values, then you can derive exact values for many 
other angles, as well. Use the double-angle formula to determine the exact 
value of the sine, cosine, and tangent of 15°. 

2. Prove the law of sines, which states that for any triangle with angles A, B, C,
and corresponding side lengths a, b, c: (sin A)/a = (sin B)/b = (sin C)/c. Hint: To prove
the first equality, draw a perpendicular line from vertex C to the line AB. Now
compute sin(A) and sin(B) and compare your answers.


Lecture Fourteen 

The Joy of the Imaginary Number i


Scope: Does $\sqrt{-1}$ have any useful interpretation? Believe it or not, it actually
does, although mathematicians were very slow to accept it, which is
why they came up with such names as imaginary and complex to
describe these quantities. Imaginary numbers have a simple arithmetic
that makes it easy to add, subtract, multiply, and divide them, and even
to find square roots, cube roots, and more. Moreover, once we learn to think
outside the box of the real line and see complex numbers in a
two-dimensional world, we can understand them even better. Complex
numbers also have deep connections with trigonometry. Once we see
that connection, neither subject is nearly as complex as it initially
seems.

Outline 

I. Let's begin by thinking a bit about negative numbers.

A. The ancient Greeks refused to accept the existence of negative
numbers, but when we think about numbers on [...]
readily understand the concept of [...]
how to add, subtract, multiply, and divide [...]

B. In the real world, negative [...]
imagine that they do [...]
which we [...]
learn about [...]

[...] imaginary numbers?

[...], we simply solve it as we would [...]
[...] cancel, and the answer would be 3.

As long as we don't divide by 0, we don’t get into trouble. The
solution to i ÷ i is 1.

How about 1/i? What would be the reciprocal of this imaginary
number? We multiply that number by 1, but we write 1 as the
fraction i/i. When we do that, we get $i/i^2$; $i^2$ is still -1, and i ÷ (-1)
gives us -i.

We can also do the problem 1/(2i) by just multiplying fractions:
(1/2)(1/i), or (1/2)(-i), which is -i/2.
D. Addition and subtraction with imaginary numbers are also easy. For
example, 3i + 2i = 5i; 3i - 2i = 1i, or i; 2i - 3i = -1i, or -i.

E. How about 2 + i? The answer is just 2 + i. A number that’s of the form
a + bi is called complex. A number such as 4i is also a complex
number, but it’s called an imaginary number, because the “real part” is
0. Even a number such as 7 is a complex number, but it also happens to
be a real number; thus, 7 can be thought of as 7 + 0i. (Another rule of
arithmetic for imaginary numbers is 0i = 0.)

II. Let’s look at some arithmetic with complex numbers. Sample problems are 
shown in the table below. 


Addition          (2 + 5i) + (5 + 3i) = 7 + 8i
Subtraction       (2 + 5i) - (5 + 3i) = -3 + 2i
Multiplication    (2 + 5i)(3 + 4i) = 6 + 15i + 8i - 20 = -14 + 23i
Division          (2 + 5i)/(3 + 4i) = (2 + 5i)(3 - 4i)/((3 + 4i)(3 - 4i)) = (26 + 7i)/($3^2 + 4^2$) = (26 + 7i)/25


A. If we add (2 + 5i) + (5 + 3i), the answer is 7 + 8i. How about
(2 + 5i) - (5 + 3i)? We subtract the real parts, (2 - 5) = -3, then we
subtract the imaginary parts, (5i - 3i) = 2i; thus, we get -3 + 2i.

B. When we multiply complex numbers, we use FOIL, just as we did
with polynomials. For the problem (2 + 5i)(3 + 4i), we get: 2(3) = 6,
5i(3) = 15i, 2(4i) = 8i, and 5i(4i) = $20i^2$. The only new idea here is that
we can use the fact that $i^2 = -1$. Thus, we have 6 + 15i + 8i - 20, which
simplifies to -14 + 23i.


C. Here’s another problem: (3 + 4i)(3 - 4i). (The second quantity is called
the conjugate of the first quantity; that is, a + bi has the conjugate
a - bi.) When we use FOIL, we get 9 + 12i - 12i - $16i^2$, which
simplifies to 9 - 16(-1) = 9 + 16 = 25.

D. Conjugates help make the division of complex numbers easier. For
instance, notice that if we multiply any number of the form a + bi by its
conjugate, a - bi, we get $a^2 - b^2i^2$, but the $-b^2i^2$ becomes $+b^2$; thus,
$(a + bi)(a - bi) = a^2 + b^2$.

1. If we want to find the reciprocal of a + bi, that is, 1/(a + bi), we
simply multiply both the top and the bottom by the conjugate a - bi:

$\frac{1}{a + bi} = \frac{1}{a + bi} \cdot \frac{a - bi}{a - bi} = \frac{a - bi}{a^2 + b^2}$.

Notice that the denominator of this fraction is a real number.

2. In this way, we never have to have a complex number in the
denominator; we can always eliminate complex numbers by
multiplying the numerator and the denominator by the conjugate of
the denominator.

3. For example, to divide (2 + 5i)/(3 + 4i), we simply multiply the top
and the bottom by 3 - 4i. When we do that multiplication, we get
6 + 15i - 8i - $20i^2$ = 26 + 7i in the numerator and $3^2 + 4^2 = 25$ in
the denominator. That is, (2 + 5i)/(3 + 4i) = (26 + 7i)/25.
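
The sample problems in this section can be checked with Python's built-in complex type, which writes the imaginary unit as `j` rather than i. The sketch below is illustrative; the variable names are assumptions, and the `manual` line mirrors the conjugate trick by hand rather than relying on the built-in division.

```python
z, w = 2 + 5j, 5 + 3j
u = 3 + 4j

total = z + w                      # add real parts and imaginary parts
diff = z - w
prod = z * u                       # FOIL, with i^2 replaced by -1
conj_prod = u * u.conjugate()      # (a + bi)(a - bi) = a^2 + b^2

# Division two ways: built in, and by multiplying top and bottom by the
# conjugate of the denominator.
quot = z / u
manual = (z * u.conjugate()) / conj_prod.real

print(total, diff, prod, conj_prod)
print(quot, manual)
```

The last line should show the same value twice, matching (26 + 7i)/25 up to floating-point rounding.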

III. We saw that real numbers exist on the real line, but complex numbers exist
on what’s called the complex plane. Think of the x-axis and the y-axis, as
we’ve been using in geometry.

A. For instance, the number 1 + i will have a “real part” of 1, an
x-coordinate of 1, and a y-coordinate (the coefficient of i) of 1. To find
1 + i, then, we go to the right 1 and up 1. Think of the y-axis as having
the number 0 where the two axes meet. As we go up, we see 1i, 2i, 3i, and
as we go down the imaginary axis, or y-axis, we see -i, -2i, -3i.

B. Let’s do a few more examples: For 2 + 2i, we go to the right 2 and up
2. For -2 + i, we go to the left 2 and up 1. For -3 - 2i, we go to the left
3 and down 2.

C. For 2 + i, we go to the right 2 and up 1. What happens if we multiply
2 + i by 1/2? Then, we’d have the number 1 + (1/2)i, which would be
halfway along the line from 0 to the point 2 + i. Thus, when we
multiply by 1/2, the length of the line changes by a factor of 1/2.

D. Similarly, if we multiply 2 + i by 2, we get 4 + 2i. The line, or vector,
that goes from the origin, the point 0 + 0i, to the point 4 + 2i will be
twice as long as it was before.

E. When we multiply a complex number by a real number, the line
expands by a factor of that real number. If we multiply by a positive
number, the line still points in the same direction. If we multiply by a
negative number, the line points in the opposite direction.

IV. We can “see” how to add two complex numbers, such as a + bi and c + di,
by looking at their pictures on the complex plane.

A. We see a line that goes from 0 to a + bi and a line that goes from 0 to
c + di. Those two lines can form the sides of a parallelogram. The top
of the parallelogram is the point representing the sum of a + bi and
c + di. In other words, we start at a + bi, then add the vector that goes to
c + di to get the sum.

B. Look again at the line that goes from 0 to the point a + bi. We can
define the length of that line to be the length of the complex number.

1. The base of the triangle we see has length a and the height has
length b. By the Pythagorean theorem, the hypotenuse of this right
triangle will have length $\sqrt{a^2 + b^2}$; we define that to be the length
of the complex number a + bi.
2. The angle near the origin of this triangle would be the angle
associated with the complex number. That angle is measured
counterclockwise from the x-axis.

3. We can “see” how to multiply complex numbers in much the same
way. In this case, we use two simple rules: Multiply the lengths
from the origin and add the angles. We see, for example, if a
complex number a + bi has an angle of about 30° and length 4
and if the point c + di has an angle of 120° and length 2, to obtain
their product, we simply multiply the lengths (4 × 2 = 8) and add
the angles (30° + 120° = 150°). Hence, the product will be:

$8(\cos 150^\circ + i\sin 150^\circ) = 8\left(-\frac{\sqrt{3}}{2} + \frac{1}{2}i\right) = -4\sqrt{3} + 4i$.


C. To summarize, we can add two points, such as (a + bi) and (c + di), by
drawing a parallelogram. We can multiply those points by multiplying
the lengths and adding the angles.

D. Here’s another example: (2 + 2i)(-5 + 5i).

1. What’s the length of the line from the origin to 2 + 2i? The length
of a + bi is $\sqrt{a^2 + b^2}$; thus, the length of 2 + 2i will be $\sqrt{2^2 + 2^2}$,
or $\sqrt{8}$. The length of -5 + 5i will be $\sqrt{(-5)^2 + 5^2}$, or $\sqrt{50}$. When we
multiply those lengths together, we get $\sqrt{400} = 20$.

2. The angle that cuts the first quadrant exactly in half at the point
2 + 2i is 45°. The angle for -5 + 5i is 135°. When we add those
angles together, we get 180°.

3. Incidentally, as we mentioned in the trigonometry lecture, a
mathematician would call the measure of the first angle π/4
radians, instead of 45°. The second angle would be 3π/4 radians,
instead of 135°. Adding those together, we get 4π/4 radians, or π
radians.

4. Returning to the problem, when we multiply those numbers
together, we get something that has a length of 20 and an angle of
180°. But an angle of 180° points along the negative x-axis. Thus, we
have a length of 20 pointing in the negative direction, or -20, as the
answer.
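
The "multiply the lengths and add the angles" rule can be tried out on this example with Python's `cmath` module. The sketch is illustrative only: `cmath.polar` returns a number's length and angle (in radians), and `cmath.rect` rebuilds a complex number from a length and an angle.

```python
import cmath
import math

z, w = 2 + 2j, -5 + 5j

r1, t1 = cmath.polar(z)   # length sqrt(8), angle pi/4
r2, t2 = cmath.polar(w)   # length sqrt(50), angle 3*pi/4

# Rule: multiply the lengths and add the angles.
length = r1 * r2
angle = t1 + t2
product_polar = cmath.rect(length, angle)

print(f"length = {length:.4f}, angle = {math.degrees(angle):.1f} degrees")
print("polar product:", product_polar)   # tiny imaginary part is rounding error
print("direct product:", z * w)
```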

E. Why does this rule of multiplying the lengths and adding the angles
work? Once again, Euler gives us the equation for this:
$e^{i\theta} = \cos\theta + i\sin\theta$ (e is a special number that we’ll talk about later).

1. Look at the unit circle again. Euler says that the point on the unit
circle at angle θ can be called $e^{i\theta}$. Note that we
would normally call that point (cos θ, sin θ) if we were in the x, y
plane, but in the complex plane, we call it cos θ + i sin θ, and we
can simplify that to $e^{i\theta}$.

2. Any complex number on the unit circle is of the form $e^{i\theta}$. We can
even get beyond the unit circle, that is, to a point that has angle θ
but has length R, represented by $Re^{i\theta}$.

3. For example, the number 2 + 2i has a length of $\sqrt{8}$ and an
angle of π/4 radians. We can write this in polar form by saying
$2 + 2i = \sqrt{8}\,e^{i\pi/4}$.

4. What if we were to stay on the unit circle and move 90°, or π/2
radians? We then find ourselves at the point i, that is, $e^{i\pi/2}$.

5. What happens when we multiply complex numbers? If we write
those numbers in polar form (let's say our first number was
$R_1e^{i\theta_1}$ and our second number was $R_2e^{i\theta_2}$), then when we multiply
those, we get $R_1R_2e^{i(\theta_1 + \theta_2)}$. We're just using the laws of arithmetic
and the law of exponents. The result, $R_1R_2e^{i(\theta_1 + \theta_2)}$, tells us to do
exactly what our two simple rules say, namely, multiply the
lengths and add the angles.

F. What would Euler say about (i)(i)?

1. We said that $i = e^{i\pi/2}$, so $(i)(i) = (e^{i\pi/2})(e^{i\pi/2})$; that would give us
$e^{i\pi}$, but we also know that (i)(i) = -1; thus, $e^{i\pi} = -1$. That says that e
raised to i multiplied by the angle of π radians (that’s 180°) puts us at
the real number -1.

2. If we rearrange that equation, it becomes: $e^{i\pi} + 1 = 0$, one simple
equation that contains the five most important numbers and some
of the most important relations in mathematics. What this
“profound” equation says is simply that if you move 180° along
the unit circle, you wind up at -1.

G. We can use Euler’s equation to derive many complicated trigonometric
identities. For example, we know

$e^{i(2\theta)} = \cos(2\theta) + i\sin(2\theta)$. But it’s also true that

$e^{i(2\theta)} = e^{i\theta}e^{i\theta} = (\cos\theta + i\sin\theta)^2 = (\cos^2\theta - \sin^2\theta) + i(2\sin\theta\cos\theta)$.

Comparing the real and imaginary parts gives us

$\cos(2\theta) = \cos^2\theta - \sin^2\theta$, and $\sin(2\theta) = 2\sin\theta\cos\theta$.
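
Euler's equation and its consequences can be spot-checked numerically. The following Python sketch uses `cmath.exp` and an arbitrary sample angle (`theta = 0.7` radians, chosen only for illustration); the results agree up to floating-point rounding, which is why the comparisons use small tolerances rather than exact equality.

```python
import cmath
import math

theta = 0.7  # an arbitrary sample angle in radians

# Euler's equation: e^(i*theta) = cos(theta) + i*sin(theta).
lhs = cmath.exp(1j * theta)
rhs = complex(math.cos(theta), math.sin(theta))
print(abs(lhs - rhs) < 1e-12)

# Moving 90 degrees along the unit circle lands on i; moving 180 lands on -1.
quarter_turn = cmath.exp(1j * math.pi / 2)
euler_identity = cmath.exp(1j * math.pi) + 1
print(abs(quarter_turn - 1j) < 1e-12, abs(euler_identity) < 1e-12)

# Squaring e^(i*theta) doubles the angle, which yields the double-angle formulas.
squared = lhs ** 2
print(math.isclose(squared.real, math.cos(theta) ** 2 - math.sin(theta) ** 2))
print(math.isclose(squared.imag, 2 * math.sin(theta) * math.cos(theta)))
```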


V. Complex numbers can also help us with algebra. 

A. For instance, without complex numbers, we could not find a solution to
the equation $x^2 + 1 = 0$. We know, however, that this equation has at least
one solution, namely, i, because $i^2 + 1 = -1 + 1 = 0$. We can also find another
solution because $(-i)^2$ is also -1. Similarly, the equation $x^2 + 9 = 0$ has
two solutions, namely, 3i and -3i, as does the equation $x^2 + 7 = 0$, which
has solutions $\sqrt{7}\,i$ and $-\sqrt{7}\,i$.

B. With a more complicated algebraic expression, such as $x^2 + 2x + 5 = 0$,
we use the quadratic formula: $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. Plugging into that
formula, we get the result shown below:

$x = \frac{-2 \pm \sqrt{2^2 - 4(1)(5)}}{2} = \frac{-2 \pm \sqrt{-16}}{2} = \frac{-2 \pm 4i}{2} = -1 \pm 2i$.
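
The quadratic formula carries over to code unchanged once the square root is allowed to be complex. This is a minimal Python sketch; the function name `quadratic_roots` is hypothetical, and `cmath.sqrt` is used because `math.sqrt` rejects negative inputs.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0, allowing complex answers."""
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

# The example from the outline: x^2 + 2x + 5 = 0.
r1, r2 = quadratic_roots(1, 2, 5)
print(r1, r2)

# Plugging each root back into the polynomial should give (nearly) zero.
for r in (r1, r2):
    print(abs(r * r + 2 * r + 5) < 1e-12)
```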

C. Earlier, we discussed the fundamental theorem of algebra, but we can
express that in a more polished way using complex numbers.

1. The fundamental theorem of algebra says that if P(x) is a
polynomial with real or complex coefficients, then we can always
factor it in the form $P(x) = (x - r_1)(x - r_2)\cdots(x - r_n)$, where the
roots $r_1, \ldots, r_n$ are complex numbers. We could factor any nth-degree
polynomial into these n parts.

2. We said earlier that the polynomial equation P(x) = 0 has, at most,
n real solutions. In fact, we can say that it has “sort of exactly” n
solutions, namely, $r_1, r_2, r_3, \ldots, r_n$. (We say “sort of exactly” because
it’s possible that some of the roots are repeated.)

VI. In this lecture, we’ve defined imaginary numbers; seen how to add,
subtract, multiply, and divide them; and seen how to use them algebraically and
geometrically. We’ve also seen Euler’s equation and some of its
applications.

Reading: 

Al Cuoco, Mathematical Connections: A Companion for Teachers and Others,
chapter 3.

Paul J. Nahin, An Imaginary Tale: The Story of $\sqrt{-1}$.

Questions to Consider:

1. Once you overcome the obstacle of imagining $\sqrt{-1}$, it’s easy to imagine the
square root of any complex number. For instance, can you find two numbers
with a square of i? (Hint: They will both lie on the unit circle.)

2. In general, for every positive integer n, every nonzero complex number has
exactly n distinct nth roots. For instance, can you describe all of the nth roots
of 1? Express your answer in terms of $e^{i\theta}$ and plot the points on the unit circle.
Prove that the sum of these roots is always 0. 
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Lecture Fifteen 
The Joy of e 

Scope: The irrational number e = 2.71 828 1 82845904... is perhaps the most 
important number in calculus. Its importance stems from the fact that 
the exponential function e* has the property that it is equal to its own 
derivative. In fact, any time the rate of growth of a function is 
proportional to the function’s value, then & is inevitably present in that 
function. We’ll also discuss the inverse of the exponential function, the 
natural logarithm. As we’ll see, the number e also arises in a “tug-of- 
war” with 1 and infinity that allows us to compute compound interest 
and other important quantities with “ease.” 

Outline 

I. Let’s begin by creating e. 

A. We start with (1 + 1/10)^10. The result is 2.593.... Next, we look at 
(1 + 1/100)^100. We’re doing two things here: We’re making the base 
closer to 1, and we’re making the exponent much bigger. The result for 
the second expression is 2.70481.... 

B. Let’s try again: (1 + 1/1000)^1000. That result is 2.71692..., still close to 
2.7. In fact, as we take this process farther and farther out, as n gets 
larger and larger, (1 + 1/n)^n gets closer and closer to the magical 
number e: 2.718281828459.... In mathematical terms, as n goes to 
infinity, e is the limit of (1 + 1/n)^n. 

C. We can generalize (1 + 1/n)^n: For any number x, if we take the limit as 
n goes to infinity of (1 + x/n)^n, we get e^x. 
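
The limits in section I can be checked numerically. The following is a minimal Python sketch, not part of the lecture; the function name compound_limit is our own choice, and the final comparison uses x = 2 purely as an illustration.

```python
import math


def compound_limit(x: float, n: int) -> float:
    """Return (1 + x/n)**n, which approaches e**x as n grows."""
    return (1 + x / n) ** n


if __name__ == "__main__":
    # Reproduce the values quoted in the outline, then push n further.
    for n in (10, 100, 1000, 1_000_000):
        print(f"n = {n:>9,}  (1 + 1/n)^n = {compound_limit(1, n):.5f}")
    print(f"math.e = {math.e:.5f}")
    # The generalization: (1 + x/n)^n approaches e^x, shown here with x = 2.
    print(compound_limit(2, 1_000_000), math.exp(2))
```

Floating-point arithmetic limits how large n can usefully be, so this is a numerical illustration of the limit rather than a proof.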

II. The number e relates to compound interest. 

A. Suppose you put $1,000 in a bank account that earns 6% each year. 
After one year, how much money will you have? We can find the 
answer by multiplying $1,000 by 1.06, which gives us $1,060. 
Assuming that you didn’t take the interest out of your account, after 
two years, you’ll have $1,000 × 1.06 × 1.06, or $1,123.60. After three 
years, you’ll have $1,000 × 1.06^3, about $1,191.02. After t years, 
you’ll have $1,000 × 1.06^t. 

B. Let’s focus on one year and suppose that instead of being compounded 
annually, the interest was compounded semiannually. Instead of giving 
you a lump sum of 6% at the end of the year, the bank gives you 3% 
after six months and another 3% when the year ends. That’s equal to 
$1,000(1.03)^2, or $1,060.90. 

C. Suppose that your interest was compounded quarterly. That means that 
every three months, you’ll get 1.5% interest. You figure the interest by 
$1,000(1.015)^4. If the interest is compounded monthly, you get 0.5% 
each month: $1,000(1.005)^12. If the interest is compounded daily, you 
figure the interest by: $1,000(1 + .06/365)^365, which is $1,061.83. 

D. If the bank compounds the interest n times per year, your interest rate 
will be 6%/n per time period. With $1,000, you’ll get $1,000(1 + .06/n)^n 
at the end of the year. Compounding continuously means letting n grow 
without bound. 

1. As we know from the formula we found for e^x, if we raise 
(1 + .06/n) to the nth power, as n gets larger and larger, we get 
closer and closer to e^(.06). When we calculate $1,000 × e^(.06), we get 
$1,061.84. Thus, with interest compounded continuously instead of 
daily, you earn an extra penny. 

2. But we also have a simpler equation than we had before. The 
general formula for 6% interest compounded continuously at the 
end of one year for $1,000 is: $1,000e^(.06). For t years, the formula 
is: $1,000e^(.06t). 

3. Starting with a principal amount p and an interest rate r, after 
t years with continuous compounding, the general formula for 
the amount in the account is: pe^(rt). 
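
The compounding schedules in section II can be compared side by side. This is a small Python sketch under our own naming (balance and balance_continuous are illustrative helpers, not from the lecture); it assumes the $1,000 principal and 6% rate used above.

```python
import math


def balance(principal: float, rate: float, years: float, periods_per_year: int) -> float:
    """Balance when interest is compounded periods_per_year times each year."""
    return principal * (1 + rate / periods_per_year) ** (periods_per_year * years)


def balance_continuous(principal: float, rate: float, years: float) -> float:
    """Balance with continuous compounding: p * e**(r * t)."""
    return principal * math.exp(rate * years)


if __name__ == "__main__":
    schedules = [("annually", 1), ("semiannually", 2), ("quarterly", 4),
                 ("monthly", 12), ("daily", 365)]
    for label, n in schedules:
        print(f"{label:>13}: ${balance(1000, 0.06, 1, n):,.2f}")
    print(f"{'continuously':>13}: ${balance_continuous(1000, 0.06, 1):,.2f}")
```

Each more frequent schedule adds a little to the year-end balance, and the continuous formula is the ceiling those balances approach.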

III. Let’s do another application with e, this one involving homework. 

A. My students have turned in a number of homework assignments, but I 
don’t want to grade them. I randomly return the homework to my 
students for grading, but I don’t want any student to be in the position 
of grading his or her own paper. My question is: How likely is it that 
nobody gets his or her own paper? 

B. Suppose I have three students, A, B, and C. In how many ways can I 
return their homework papers? We know from our earlier lectures that 
there are 3! = 6 ways of returning three homework papers, but only two 
out of the six ways result in no student getting his or her own 
homework back. Thus, if I randomly return the homework, the chance 
that no student gets his or her own homework is 2 out of 6. 

C. If I have four students, then there are 4! = 24 ways of returning the 
homework papers. Of those 24 ways, only 9 result in no student getting 
his or her own homework back. The chances that no one gets his or her 
own homework are 9 out of 24, or 3/8, or .375. 

D. If we look at the chances with five students, six students, and so on, we 
see that the results get closer and closer to the same number. With five 
students, the chance is .366; with six students, it’s about .368; with 100 
students, it’s .3678.... 

E. Those results are strange. Whether I’m returning 100 papers back to 
100 students, or 10 papers back to 10 students, or 1,000,000 papers 
back to 1,000,000 students, the chance that nobody gets his or her own 
homework is practically .368. This magic number .368 is 1/e; it’s the 
reciprocal of e, 2.71828. 

F. Why should this be? If I have n students in the classroom, the chance 
that the first student will get his or her own homework is 1/n, and the 
chance is the same for the second student and so on. The chance that 
you won’t get your own homework back is 1 - 1/n; therefore, the 
chances that no student gets his or her own homework are 
approximately (1 - 1/n)^n. Our earlier formula said that (1 + x/n)^n 
approaches e^x as n gets large. That’s the situation we have here except 
that x is -1. That is, we have (1 + (-1)/n)^n; as n goes to infinity, that 
result is e^(-1), or 1/e. 
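
The homework counts in section III can be reproduced by brute force for small classes and by a recurrence for large ones. The sketch below is our own illustration: the function names are hypothetical, and the recurrence D(n) = (n - 1)(D(n - 1) + D(n - 2)) is a standard counting identity that the lecture does not state.

```python
import math
from itertools import permutations


def count_by_listing(n: int) -> int:
    """Count the ways to return n papers so that nobody gets his or her own."""
    return sum(
        1
        for p in permutations(range(n))
        if all(p[i] != i for i in range(n))
    )


def count_by_recurrence(n: int) -> int:
    """Same count via D(n) = (n - 1) * (D(n - 1) + D(n - 2)), D(0) = 1, D(1) = 0."""
    if n == 0:
        return 1
    two_back, one_back = 1, 0
    for k in range(2, n + 1):
        two_back, one_back = one_back, (k - 1) * (one_back + two_back)
    return one_back


def chance_nobody_gets_own(n: int) -> float:
    """Probability that a random return gives no student his or her own paper."""
    return count_by_recurrence(n) / math.factorial(n)


if __name__ == "__main__":
    for n in (3, 4, 5, 6, 100):
        print(n, chance_nobody_gets_own(n), (1 - 1 / n) ** n)
    print("1/e =", 1 / math.e)
```

The second printed column is the approximation (1 - 1/n)^n from item F; both columns settle near 1/e.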

IV. The number e was first used by Isaac Newton, but it was studied, analyzed, 
and named by the great Swiss mathematician Leonhard Euler. 

V. How does the function e^t grow? 

A. Looking at the graph of that function, we see that e^t grows fairly 
quickly; e^(2t) grows faster, and e^(3t) grows even faster. These are called 
exponential functions. 

B. Let’s look at the function 5^t. The number 5 is between e, 2.718, and e^2, 
which is about 7.389. That means that 5^t is between e^t and e^(2t); 
therefore, 5 is equal to e raised to some power between 1 and 2. 

1. Let’s say that 5 is e^r, in which r is some real number between 
1 and 2. That means that we can replace 5 in the function 5^t with the 
number e^r raised to a power of t. Thus, 5^t is the same as (e^r)^t. By the 
law of exponents, that’s e^(rt). 

2. To find the number r in this expression, we need to look at 
logarithms. 

C. Logarithms are based on, initially, the powers of 10: 10^0 = 1, 
10^1 = 10, 10^2 = 100, 10^3 = 1,000, and so on. Negatively, 10^(-1) = 1/10, 
10^(-2) = 1/100, and 10^(-3) = 1/1,000. We say that the logarithm of x, 
denoted log x, solves the equation 10^(log x) = x. 

1. The logarithm of x is the exponent to which we have to raise 10 
in order to get x. For example, log 1,000 = 3 because 10^3 = 1,000. 
Log 100 = 2 because 10^2 = 100. Log 10^y = y because we raise 10 
to a power of y to get 10^y. 

2. Can we find log √10? The result for √10 is 10^(1/2); thus, log √10 
is 1/2. 

3. What is log 512? A calculator tells us that log 512 is about 2.709. 
Does that seem reasonable? We know that log 100 = 2 and log 
1,000 = 3. Because 512 is between 100 and 1,000, it follows that 
log 512 should also be between log 100 and log 1,000, or between 
2 and 3. 

D. There are other useful rules for logarithms. For instance, we’ve said 
that log 10^x = x for any x. Another sensible rule is 10^(log x) = x. Again, if 
we think about the definition of log, that makes sense. 

E. Perhaps the most commonly used property of the logarithm is the one 
that states: The log of the product is the sum of the logs: log (xy) = 
log x + log y. 

1. Look at the expression 10^(log x + log y). According to the law of 
exponents, 10^(a + b) = 10^a × 10^b; thus, the expression would equal 
10^(log x) × 10^(log y). We know, however, that 10^(log x) is x and 
10^(log y) is y, so that gives us x × y. On the other hand, we know from 
our useful log rule that 10^(log (xy)) is also equal to xy. 

2. What have we done here? We’ve taken 10 to some power and 
obtained xy. We then took 10 to another power and obtained xy; 
therefore, the two powers must be equal. Equating these powers 
tells us that log x + log y must equal log xy. 

F. As a corollary to that last rule, we can also show what I call the 
exponent rule: log (x^n) = n log x. Let’s look at a couple of examples. 

G. Historically, logarithms were useful for converting difficult 
multiplication problems into more straightforward addition problems. 

H. Let’s illustrate the product rule and exponent rule for logarithms. 

1. If log 2 = .301... and log 3 = .477..., then log 6 = log (2 × 3) = 
log 2 + log 3 = .301... + .477... = .778.... 

2. Can we find log 5 knowing log 2 and log 3? We don’t need to 
use log 3 in this solution, but we do need to use log 10, which is 
1; thus, log 5 = log (10 × 1/2), or log 10 + log 1/2, and we know 
log 1/2 because 1/2 is 2^(-1). We now have log 10 + log 2^(-1), but by 
the exponent rule, log 2^(-1) is -1 × log 2. This is equal to 1 - log 2, 
or 1 - .301, or about .699. 

3. Earlier in this lecture, we looked at log 512. Note that 512 is 2^9. 
Log 2^9, by the exponent rule, is equal to 9 × log 2. Because 
log 2 is .301, that gives us 2.709, as we saw earlier. 

I. We’ve been talking about logarithms using base 10, but we can also use 
logarithms in other bases. We define log_b x to be the exponent that 
solves b^(log_b x) = x. 

1. For instance, as we noted above, 2^9 is 512; thus, the log (base 2) of 
512 is 9 because we have to raise 2 to the 9th power to get 512. 

2. The rules for logarithms in other bases are, in fact, virtually 
unchanged from the rules for base 10: log_b b^x = x, 
log_b (xy) = log_b x + log_b y, and log_b (x^n) = n log_b x. 

3. We can also change from one base to any other base: log_b x 
is log x ÷ log b, where that log could be the log (base 10) or any 
other base. 

4. In chemistry and the physical sciences, the base 10 logarithm is 
probably the most popular. In computer science, base 2 is the most 
popular log. But in math, physics, and engineering, by far, the most 
popular base of the logarithm is the log base e, the natural log. 
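
The logarithm rules in section V are easy to spot-check with Python’s standard math module. This is only a sketch: log_base is a helper name we introduce for the change-of-base rule, and the sample inputs are the ones used in the outline.

```python
import math


def log_base(b: float, x: float) -> float:
    """Change of base: log_b x = log x / log b (any common base works)."""
    return math.log10(x) / math.log10(b)


if __name__ == "__main__":
    # Product rule: log (2 * 3) = log 2 + log 3.
    print(math.log10(6), math.log10(2) + math.log10(3))
    # Exponent rule: log (2**9) = 9 * log 2.
    print(math.log10(512), 9 * math.log10(2))
    # Other bases: the log (base 2) of 512 is 9.
    print(log_base(2, 512))
    # The number r with 5 = e**r is the natural log of 5, so 5**t = e**(r*t).
    r = math.log(5)
    print(r, math.exp(r * 3), 5 ** 3)
```

The last line answers the question left open in item B: the exponent r is the natural log of 5, which lies between 1 and 2 as predicted.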

Reading: 

Y. E. O. Adrian, The Pleasures of Pi, e and Other Interesting Numbers. 

Eli Maor, e: The Story of a Number. 

Questions to Consider: 

1. With $10,000 in a savings account earning 3% interest each year, 
compounded continuously, about how much money will be in the account 
after 10 years? 

2. Starting with the famous formula for e: 1 + 1/1! + 1/2! + 1/3! + 1/4! + 
... = e, determine the following sums: 

1/1! + 2/2! + 3/3! + 4/4! + 5/5! + ... 

1 + 3/2! + 5/4! + 7/6! + 9/8! + 11/10! + ... 

1/1! + 2/3! + 3/5! + 4/7! + 5/9! + 6/11! + ... 

Lecture Sixteen 
The Joy of Infinity 


Scope: What is the meaning of infinity? Are some infinite sets “more infinite” 
than others? What does it mean to sum an infinite number of numbers? 
These are the questions we’ll explore in this lecture. At first glance, it 
would seem that there are more positive numbers than positive even 
numbers, yet both sets have infinite size. Moreover, these sets have the 
same order of infinity because they can be put in one-to-one 
correspondence. In the same way, we can show that there are as many 
fractions as positive integers, yet we will see that there are “more” 
irrational numbers than rational numbers. There is even an infinite 
number of levels of infinity. But we dare not explore these concepts too 
deeply, lest we meet the same fate as Georg Cantor, the man who 
developed these concepts but battled depression at the end of his life. 

Outline 


I. Is infinity a number? 

A. Technically, infinity is not a number. It’s treated as if it’s something 
larger than any number. As we go to the right on the number line, we’re 
approaching infinity. 

B. Sometimes, though, we do treat infinity as a number, represented by the 
symbol ∞. For instance, we might say that adding all the positive 
numbers equals infinity, although most mathematicians would say that 
the sum goes to infinity. 

C. For the sum to go to infinity means that it will be larger than any 
number you ask for — larger than a million, a trillion, even a googol. 

D. Though it doesn’t get as much attention, the cousin of infinity is 
negative infinity, denoted by -∞. The sum of all negative numbers 
gets smaller than any negative number you could ask for. 

E. As a mathematical convenience, we make statements such as 
1/∞ = 0. That makes sense because if we divide 1 by bigger and 
bigger numbers, then the quotient gets closer and closer to 0. We can 
even say 1/-∞ = 0 because if we divide 1 by negative one million, or 
negative one billion, etc., the result gets closer to 0. 

F. On the other hand, we are never allowed to divide by 0; thus, we 
couldn’t say 1/0 = ∞. The real reason we don’t allow that is that 
1/0 could be infinity or negative infinity. If we divide 1 by a tiny 
positive number, the answer will be a big positive number. If we divide 
1 by a tiny negative number, the answer will be a big negative number. 
In other words, as our denominator gets closer to 0 from the right, 
we’re going to infinity; as it gets closer to 0 from the left, we’re going 
to negative infinity. That’s why we let 1/0 be undefined. 

G. There are some infinite sums that add to something besides infinity. For 
instance, 1 + 1/2 + 1/4 + 1/8 + 1/16 + ... = 2. Also, 1 + 1/1! + 1/2! + 
1/3! + ... = e. We don’t necessarily get infinity as our answer even if we 
have an infinite sum. 

II. In this lecture, rather than using infinity as a number-like object, we will use 
it as a size. 

A. The size of a set (or cardinality of a set) S (denoted |S|) is the number 
of elements in the set. For instance, if S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, 
then |S| = 10. What’s the size of the set of all positive integers? 
Because that set has infinitely many elements, its size is infinity; the 
size of the set of all even numbers is also infinity. 

B. What is the size of the set of all fractions? Because the set of 
fractions contains all of the integers, the size of that set is infinite as well. 
The size of the set of real numbers between 0 and 1 is also infinite. 

C. As we will see, however, some infinities are more infinite than others. 

III. Let’s try a thought experiment. 

A. Suppose that every chair in my class is occupied by a student, and no 
students are chair-less. I could pair up students with chairs and 
conclude that there are as many students as there are chairs. This is 
called a one-to-one correspondence. 

B. We can use this same idea to compare the set of positive odd numbers 
with the set of positive even numbers. Not only are there an infinite 
number of both of those objects, but they have the same order of 
infinity because we can pair them up. Those sets, then, are infinite, and 
they have the same size. 

C. What about the sizes of the sets of all positive integers (1, 2, 3, 4, 5, 
6, ...) and all positive even integers (2, 4, 6, 8, 10, 12, ...)? 

1. I claim that those two sets have the same size, not because 
they are infinite, but because we can pair them up. 

2. Here, 1 is paired with 2, 2 is paired with 4, 3 is paired with 6, and 
so on. 

D. Mathematicians say that any set that can be paired up with the positive 
integers is countable because we could essentially list all the numbers 
in the set just by counting. 

1. For example, the set of all integers (positive, negative, and 0) is 
countable. It can be put in one-to-one correspondence with the 
positive integers because we can list them all with no infinite gaps. 

2. If we list the integers as 0, 1, -1, 2, -2, 3, -3, 4, -4, 5, -5..., 
eventually, we will reach every positive and every negative 
number. We can’t, however, list the positive numbers first, then the 
negative numbers, because we’d never finish with the first step. 

E. These ideas were first put forth by the German mathematician Georg 
Cantor, but it took mathematicians decades to come to grips with them. 
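
The listing of the integers in item D can be written as a generator, which makes the “no infinite gaps” idea concrete: every integer has a definite finite position. This is a minimal Python sketch; the names all_integers and position_of are ours, and positions are counted from 1.

```python
from itertools import count, islice


def all_integers():
    """Yield 0, 1, -1, 2, -2, 3, -3, ... so every integer eventually appears."""
    yield 0
    for n in count(1):
        yield n
        yield -n


def position_of(k: int) -> int:
    """The positive integer paired with k in the listing above (1-based)."""
    if k == 0:
        return 1
    return 2 * k if k > 0 else -2 * k + 1


if __name__ == "__main__":
    print(list(islice(all_integers(), 11)))
    # The pairing n <-> 2n between positive integers and positive even integers.
    print([(n, 2 * n) for n in range(1, 6)])
    print(position_of(-1_000_000))
```

Because position_of returns a finite number for any integer, the listing is a one-to-one correspondence with the positive integers.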

IV. Now, let’s look at a larger set of numbers, the set of fractions. 

A. Of course, there are an infinite number of positive fractions, but are 
they countable? We might try listing the fractions out row by row, but 
that would leave infinite gaps. If we list the fractions out diagonally, 
however, we see that the set of rational numbers is countable. It has the 
same size as the set of positive integers. 

B. Can we find a set that is not countable? Surprisingly, the set of real 
numbers between 0 and 1 is uncountable. We can show this with a 
proof by contradiction. 

1. Suppose you claim to have listed every real number between 0 and 1. 
Say you begin your list with the number .31415926...; your 
second number is .12121212...; your third number is .500000...; and 
your fourth number is .61803399.... I can use your list to create a 
real number that can’t be on the list. 

2. I begin with your first number, .31415926.... I add 1 to the first 
digit of that number to change it to 4. Then, I add 1 to the second 
digit of your second number to change that digit to 3. I can also 
change the third digit of your third number, the fourth digit of your 
fourth number, and so on. In that way, I create a number that is not 
on your list. 

3. Let’s say I created the number .4311.... How do I know that 
number is not the millionth number on your list? It couldn’t be, 
because it will have a different digit in the millionth digit past the 
decimal point. The number I’ve created, then, can’t be the first, the 
second, or the millionth number on your list. 

4. Therefore any attempt to list the real numbers is doomed to failure: 
the list is guaranteed to be incomplete. 
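
Both arguments in section IV can be imitated on finite data. The Python sketch below is an illustration under our own assumptions: positive_fractions walks the diagonals where numerator plus denominator is constant (one of several possible diagonal orders, since the outline does not fix one) and skips repeats such as 2/4, and diagonal_digits wraps a 9 around to 0, a detail the lecture does not address.

```python
from fractions import Fraction
from itertools import islice
from math import gcd


def positive_fractions():
    """List p/q along the diagonals p + q = 2, 3, 4, ..., skipping repeats."""
    total = 2
    while True:
        for p in range(1, total):
            q = total - p
            if gcd(p, q) == 1:  # keep each fraction only in lowest terms
                yield Fraction(p, q)
        total += 1


def diagonal_digits(listed):
    """Build digits that differ from the k-th listed number in its k-th digit."""
    new_digits = []
    for k, digits in enumerate(listed):
        # Add 1 to the k-th digit; wrapping 9 to 0 is our assumption.
        new_digits.append(str((int(digits[k]) + 1) % 10))
    return "".join(new_digits)


if __name__ == "__main__":
    print([str(f) for f in islice(positive_fractions(), 10)])
    your_list = ["31415926", "12121212", "50000000", "61803399"]
    print("." + diagonal_digits(your_list))
```

With the four numbers from the outline, the diagonal construction produces the digits 4311, matching item 3.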

V. We know that the set of positive integers and the set of real numbers 
between 0 and 1 are both infinite, but the first set is countable and the 
second set is not. We now need to come up with different notations to 
represent these two different levels of infinity. 

A. We use the symbol ℵ_0 (“aleph nought”) to denote the size of the set of 
positive integers. (The symbol aleph is the first letter of the Hebrew 
alphabet.) Anything that can be put into one-to-one correspondence with 
the positive integers, any countable set, has size, or cardinality, ℵ_0. The 
set of real numbers between 0 and 1 has a greater level of infinity; 
mathematicians usually denote that level of infinity by the letter c, 
where c stands for continuum. 

B. Can we find a set that is bigger than c? For example, is there twice as 
much “stuff” in the interval between 0 and 2 as there is in the interval 
between 0 and 1? Both of these are infinite sets, but there is an 
elementary way to pair up the numbers between these two sets. 

1. Let’s look at a triangle. Inside the triangle, we have a segment of 
length 1 and, at the base, we have a segment of length 2. At the 
top, we have a laser beam shooting down, connecting every point 
between 0 and 1 with another point between 0 and 2. We can pair 
every point in the first interval with a point in the second interval. 

2. What we’re really looking at here is the function y = 2x. Every 
point on the x-axis is associated with a point on the y-axis by way 
of that function. This shows that the size of the set of real numbers 
between 0 and 2 is the same as the size of the set of real numbers 
between 0 and 1. In other words, both have size c. 

C. What about the size of the set of all real numbers from negative infinity 
to positive infinity? Is that set bigger than the set between 0 and 1? 

1. As long as we draw any function that more or less increases from 
negative infinity to positive infinity, we can create a one-to-one 
correspondence. The function we see here is a trigonometric 
function: y = tan(π(x - 1/2)). 

2. From the numbers between 0 and 1, we can get every real 
number: positive, negative, and 0. In other words, the size of the 
set of real numbers is the same as the size of the set of real 
numbers between 0 and 1. Both still have size c. 

D. Can we find a set that has a size bigger than c? 

1. Let’s look at part of the plane: the set of points inside the unit 
square (side length = 1). If there are an infinite number of points 
between 0 and 1, there are certainly an infinite number of points in 
the square that is drawn from 0 to 1 horizontally and from 0 to 1 
vertically. Amazingly, however, even this set can be put in one-to-one 
correspondence with the set of real numbers between 0 and 1. 

2. Let’s say that x is 0.r_1 r_3 r_5 r_7..., and y is 0.r_2 r_4 r_6 r_8.... 
That’s an ordered pair inside the unit square. We will associate that pair 
with the real number 0.r_1 r_2 r_3 r_4 r_5 r_6.... 

3. If we start with, say, the point 0.31415926..., that pairs up with the 
ordered pair 0.3452... and 0.1196.... Any number between 0 and 1 
can be turned into a pair of numbers between 0 and 1, and vice 
versa. 

4. To put it another way, the size of the set called R^2 (the set of all 
pairs of real numbers; pronounced “R two”) is c, where c stands 
for “continuum.” The sizes of the sets of all triples, quadruples, 
and so on of real numbers are also c. We still haven’t found a set 
that is bigger than the size of the set of real numbers. 

E. There is such a larger set: the set of all curves in the plane. That is, 
there are more curves than there are real numbers to assign them. 

F. Here is another set whose size is bigger than c: the set of all subsets of 
real numbers. That is, there are more subsets of real numbers than there 
are real numbers to assign them. 

VI. In this lecture, we’ve shown that a set is infinite if the size of that set 
exceeds any given number. The sets of integers and rational numbers are 
countable because we can list them; these have a size called ℵ_0. The real 
numbers are uncountable and have size c. Finally, there are infinitely many 
levels of infinity. Here’s a question to think about: Are those infinitely 
many levels of infinity countably infinite or uncountably infinite? 
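
The pairings used in section V to show that several sets all have size c can be tried out on finite digit strings. The following Python sketch is illustrative only: the function names are ours, real numbers are truncated to a few digits, and it ignores the subtlety that some numbers have two decimal expansions (such as .4999... and .5).

```python
import math


def pair_to_number(x_digits: str, y_digits: str) -> str:
    """Interleave the digits of x and y: r1 r3 r5... and r2 r4 r6... -> r1 r2 r3..."""
    return "".join(a + b for a, b in zip(x_digits, y_digits))


def number_to_pair(digits: str) -> tuple:
    """Split digits into the odd-position digits (x) and even-position digits (y)."""
    return digits[0::2], digits[1::2]


def unit_interval_to_real(x: float) -> float:
    """The map y = tan(pi * (x - 1/2)) from the numbers between 0 and 1 to all reals."""
    return math.tan(math.pi * (x - 0.5))


if __name__ == "__main__":
    print(number_to_pair("31415926"))
    print(pair_to_number("3452", "1196"))
    for x in (0.001, 0.25, 0.5, 0.75, 0.999):
        print(x, unit_interval_to_real(x))
```

Splitting and re-interleaving undo each other, which is what makes the correspondence between the unit square and the unit interval one-to-one in this simplified setting.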

Reading: 

Edward B. Burger and Michael Starbird, The Heart of Mathematics: An 
Invitation to Effective Thinking, chapter 3. 

William Dunham, Journey through Genius: The Great Theorems of 
Mathematics, chapters 11-12. 

Eli Maor, To Infinity and Beyond: A Cultural History of the Infinite. 

Questions to Consider: 

1. Prove that the number of irrational numbers between 0 and 1 is 
uncountable. 

2. Imagine a red robot that produces 10 billiard balls at a time, numbered 1 
through 10, then 11 through 20, then 21 through 30, and so on. Meanwhile, 
each time the red robot creates 10 balls, an evil green robot destroys a ball. 
In the first round, it destroys ball 10; in the second round, it destroys ball 
20; in the third round, it destroys ball 30; and so on. At the end of the 
process, which balls remain? (Although this is an infinite process, we can 
imagine it happening in a finite amount of time. Imagine that round 1 
occurs an hour before midnight, round 2 occurs half an hour before 
midnight, round 3 occurs a third of an hour before midnight, and so on. The 
challenge is to describe the situation “at midnight.”) 

3. Bonus question: Now suppose instead that after the first round, the green 
robot destroys ball 1; after the second round, it destroys ball 2; after the 
third round, it destroys ball 3; and so on. At the end of this process, which 
balls remain? 

Lecture Seventeen 
The Joy of Infinite Series 

Scope: What does it mean to add up an infinite number of numbers? In this 
lecture, we demonstrate that “some sums sum” to infinity. For example, 
the harmonic series 1 + 1/2 + 1/3 + 1/4 + 1/5 + ... gets arbitrarily large. 
But other sums stay small: for example, the geometric series 1 + 1/2 + 
1/4 + 1/8 + 1/16 +... actually equals 2. More surprising still, some 
infinite series can be rearranged to obtain an entirely different sum. 

Outline 

I. Let’s look at several proofs of the bold statement .999999... = 1. 

A. Here’s the most elementary proof: We agree that 1/3 = .333333333.... 
If we multiply 3 × .333333..., we get .999999.... We also know that 
3 × .333333... = 3(1/3), but 3(1/3) is exactly equal to 1. If we follow that 
chain of logic, we get: .999999... = 3 × .333333... = 3(1/3) = 1. 

B. Here’s another proof: Let S = .999999...; then 10S = 9.999999.... 
Subtracting, we get 9S = 9; hence, S = 1. 

C. Here’s yet another proof: We agree that .999999... must be less than or 
equal to 1. That means that 1 - .999999... is greater than or equal to 0. 
But 1 - .999999... would be 0.000000.... We can say that either that 
difference is 0 or that it’s smaller than any positive number and, thus, 
must be 0. We have, then, two quantities, 1 and .999999..., whose 
difference is 0, and if two quantities have a difference of 0, they must 
be the same. 

D. In summary, we could say that .99 is close to 1 and .999 is even closer 
to 1, but .999999... is as close to 1 as desired. And for that reason, we 
say that those quantities are equal. 

E. Another way of looking at .999999... is as an infinite sum, the topic for 
this lecture. Technically, .999999... = .9 + .09 + .009 + .0009 + ..., and 
we’re interested in what happens when we add an infinite number of 
numbers together. In general, we say that a series, such as a_1 + a_2 + a_3 + 
a_4 + ..., has a sum of S if the running total gets arbitrarily close to S as 
we add more and more terms. As an example, .9 + .09 + .009 + ... gets 
arbitrarily close to 1. 

II. Let’s look at the example: 1 + 1/2 + 1/4 + 1/8 + 1/16 + ... = 2. 

A. Imagine that the distance between me and a table is 2 feet. If I walk 
halfway toward the table, I’ve just walked 1 foot. If I walk half the 
distance again, I’ve walked 1/2 foot. If I walk half the distance again, 
I’ve walked 1/4 foot. With every step I take, I’m walking half as much 
as I did with the previous step. Technically, I never reach the table, but 
I get arbitrarily close to the table. That’s why we say that the sum 
1 + 1/2 + 1/4 + 1/8 + ... = 2. That sum gets as close to 2 as we desire. 

B. If an infinite sum gets closer and closer to a single number, it is said to 
converge. If it doesn’t converge, it is said to diverge. 

1. For example, the sum we just looked at converges to 2. The earlier 
example, .9 + .09 + .009 + ..., converges to 1. In contrast, the sum 
1 + 2 + 4 + 8 + 16 + ... diverges to infinity. 

2. A sum can diverge without getting larger. For instance, the sum 
1 - 1 + 1 - 1 + 1 - 1 + ... is first 1, then 0, then 1 again, then 0, and 
so on. Because that sum is not getting closer to any real number, 
we say that it diverges. 

3. In order for a sum to converge, the terms of the sum must get 
closer to 0; otherwise, the sum will not get closer to a real number. 

4. For example, the series 1 + 1/2 + 1/4 + 1/8 + 1/16 + ... is known 
as a geometric series, which has the form 1 + x + x^2 + x^3 + x^4 + .... 
In order for the terms to be getting closer to 0, the number x must 
be between -1 and +1. 

C. Here is the formula for the geometric series: For any number x strictly 
between -1 and +1, the series 1 + x + x^2 + x^3 + x^4 + ... = 1/(1 - x). Let’s 
look at a proof of that formula. 

1. Let S = 1 + x + x^2 + x^3 + x^4 + .... Multiplying that equation by x, on 
the left, we have xS; on the right, we have x + x^2 + x^3 + x^4 + .... 
Subtracting the second equation from the first to take away the “excess,” 
we have S - xS on the left, or S(1 - x); on the right, we’re left with 1. 
Solving for S, we get S = 1/(1 - x). 

2. Let’s do the example we saw earlier: When x = 1/2, then 1 + 1/2 + 
1/4 + 1/8 + 1/16 + ... = 1/(1 - 1/2), but the denominator, 1 - 1/2, is 
equal to 1/2; the answer, then, is 1/(1/2), which is 2. 

3. When x = -1/2, the geometric series tells us that 1 - 1/2 + 1/4 - 
1/8 + 1/16 - ... = 1/(1 - (-1/2)), or 1/(3/2), which is 2/3. 

D. Let’s go back to the number that we started with: .999999.... 

1. We can write that number as .9 + .09 + .009 + .... That’s not a 
geometric series yet, but we can factor out a .9 from everything, 
leaving us with .9(1 + .1 + .01 + .001 + ...). 

2. Those terms are the quantity 1/10 raised to higher and higher 
powers. In other words, we’ve pulled out a factor of 9/10 and 
we’re multiplying it by 1 + 1/10 + 1/10^2 + 1/10^3 + 1/10^4 + .... 

3. Adding, that infinite series is 1/(1 - 1/10), or 10/9. In other words, 
we have 9/10(10/9), which is 1. That’s our last proof of the fact that 
.999999... = 1. 

E. When you use the formula for the geometric series, you must be careful 
that the x you’re using is strictly between -1 and 1; if x is greater than 
or equal to 1 or less than or equal to -1, then the formula doesn’t work. 
For instance, if we let x = 2, then the formula produces the 
nonsensical result that 1 + 2 + 4 + 8 + 16 + ... = 1/(1 - 2), which is -1. 
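
The geometric series formula in section II can be compared against running totals. Here is a minimal Python sketch; the function names are ours, and raising an error outside the range -1 < x < 1 is our way of encoding the warning in item E.

```python
def geometric_partial_sum(x: float, terms: int) -> float:
    """Add up the first `terms` terms of 1 + x + x**2 + x**3 + ..."""
    total, power = 0.0, 1.0
    for _ in range(terms):
        total += power
        power *= x
    return total


def geometric_sum(x: float) -> float:
    """The formula 1/(1 - x), valid only for x strictly between -1 and 1."""
    if not -1 < x < 1:
        raise ValueError("the geometric series formula needs -1 < x < 1")
    return 1 / (1 - x)


if __name__ == "__main__":
    for x in (0.5, -0.5, 0.1):
        print(x, geometric_partial_sum(x, 60), geometric_sum(x))
    # .999999... written as .9 times the geometric series with x = 1/10.
    print(0.9 * geometric_sum(0.1))
    # With x = 2 the running totals simply blow up.
    print(geometric_partial_sum(2, 20))
```

The running totals for x = 1/2 and x = -1/2 approach 2 and 2/3, while for x = 2 they grow without bound instead of approaching -1.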

III. Let’s do an application of the geometric series. 

A. Suppose a ball is dropped from a 50-foot building, and the ball always 
rebounds to 80% of the height from which it was dropped. How far 
does the ball travel? 

B. Obviously, the ball goes 50 feet down originally, but then it travels up 
80% of that, or 40 feet. Then, it drops 40 feet and rebounds up 80% of 
that, or 32 feet. It drops 32 feet, then rebounds up 25.6 feet, then drops 
25.6 feet, and so on. What’s the total amount that the ball travels? 

C. We can write this out as a geometric series, as shown below. 

50 + 2(50)(.8) + 2(50)(.8)^2 + 2(50)(.8)^3 + ... 

Simplifying: 50 + 80(1 + .8 + .8^2 + .8^3 + ...) 

Solving: 50 + 80(1/(1 - .8)) = 50 + 80(1/(1/5)) = 450 ft 
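
The bouncing-ball total can be checked by simulating the bounces directly. This Python sketch uses our own function names and stops after a fixed number of bounces, which is an approximation to the infinite process described above.

```python
import math


def distance_after_bounces(height: float, rebound: float, bounces: int) -> float:
    """Total distance traveled: the first drop plus an up-and-down trip per bounce."""
    distance = height
    current = height
    for _ in range(bounces):
        current *= rebound       # height reached after this bounce
        distance += 2 * current  # travels up that far, then back down
    return distance


def distance_closed_form(height: float, rebound: float) -> float:
    """height + 2 * height * rebound * (1 / (1 - rebound)), from the geometric series."""
    return height + 2 * height * rebound / (1 - rebound)


if __name__ == "__main__":
    for bounces in (1, 5, 20, 200):
        print(bounces, distance_after_bounces(50, 0.8, bounces))
    print(distance_closed_form(50, 0.8))
```

After a couple of hundred bounces the simulated distance is indistinguishable from the 450 feet given by the formula.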



IV. If a sum a_1 + a_2 + a_3 + ... converges, we know that its terms must go 
to 0, but if the terms go to 0, does that guarantee that the sum converges? 
Surprisingly, the answer is no. We can understand this by looking at the 
harmonic series: 1 + 1/2 + 1/3 + 1/4 + 1/5 + .... 

A. Before we look at this proof, note that the harmonic series was given its 
name by the ancient Greeks. They noticed that strings with lengths of 1, 
1/2, 1/3, 1/4, 1/5, and so on, when plucked, tended to produce 
harmony. 

B. Now let’s look at the proof that the harmonic series goes to infinity. 

1. If we take 1 + 1/2 + 1/3 + ... + 1/9, we’re adding nine terms, and you 
would agree that each of those terms is bigger than 1/10. Thus, the 
sum of those nine terms must be bigger than 9/10. 

2. Now, let’s look at the next 90 terms, the numbers 1/10, 1/11, ..., 
1/99. Each of those terms is bigger than 1/100; thus, the sum of those 
90 terms is bigger than 90(1/100), which is 9/10. 

3. In the same way, each of the next 900 terms is bigger than 1/1,000, 
which means that the sum of those terms is bigger than 9/10. Then, the 
next 9,000 terms also add to something bigger than 9/10, and the 
next 90,000 terms add to something bigger than 9/10. 

4. In this way, the sum of all these terms is bigger than 9/10 + 9/10 + 
9/10 + .... This sum gets arbitrarily large; thus, it diverges to 
infinity. 
C. Could we scale down the harmonic series somewhat? What if we cut
down every term by a factor of 100? Does the sum 1/100 + 1/200 + 1/300 +
1/400 + ... converge or diverge?

1. We could factor 1/100 out of those terms, and we would be left 
with 1 + 1/2 + 1/3 + 1/4 + 1/5 + ..., but we know that series 
diverges to infinity and 1/100 of infinity is still infinity. 

2. Interestingly, increasing the denominators of the harmonic series
slightly brings about enough of a change to get the series to
converge. Instead of using the denominators 2, 3, 4, ..., we use $2^{1.01}$,
$3^{1.01}$, $4^{1.01}$, and so on. That makes the denominators a little bit bigger,
which makes the fractions a little bit smaller, and the sum, then,
will be less than infinity.


V. Let’s now turn to what mathematicians call an alternating series.

A. We start with the numbers $a_1 > a_2 > a_3 > a_4 > \cdots > 0$. If these numbers
are getting closer and closer to zero, then the sum $a_1 - a_2 + a_3 - a_4 + a_5 - a_6 + \cdots$
will converge to a single number. For example, 1 - 1/2 +
1/3 - 1/4 + 1/5 - 1/6 + ... must converge.

B. To prove this, think of starting at 1, then subtracting 1/2, then adding 
1/3, then subtracting 1/4, adding 1/5, subtracting 1/6, adding 1/7, 
subtracting 1/8, and so on — getting closer and closer to a single point. 

C. The sum is homing in on a single number, which incidentally, is 0.693...,
the natural log of 2. The explanation for that, however, requires
calculus.


D. Let’s look again at the same series: 1 - 1/2 + 1/3 - 1/4 + 1/5 - 1/6 + .... 
Notice that the denominators consist of all the positive integers, and all
the odd denominators are counted positively and all the even 
denominators are counted negatively. Knowing this, we can add that 
series up in a slightly different way. 

1. Consider the series shown below:

(1 - 1/2) - 1/4 + (1/3 - 1/6) - 1/8 + (1/5 - 1/10) - 1/12 + (1/7 - 1/14) - 1/16 + (1/9 - 1/18) - 1/20 + ...

2. Even though it looks different, this is just a rearrangement of the
original series: every odd denominator is added once and every even
denominator is subtracted once.

3. Next, we’ll group those numbers, which results in:

1/2 - 1/4 + 1/6 - 1/8 + 1/10 - 1/12 + 1/14 - 1/16 + 1/18 - 1/20 + ...

That is equal to:

1/2 (1 - 1/2 + 1/3 - 1/4 + 1/5 - 1/6 + 1/7 - 1/8 + ...),

or half the original series.

We started with the series 1 - 1/2 + 1/3 - 1/4 + ..., and when we
rearranged it, we paradoxically wound up with half of the original
series. In fact, we can rearrange these same sets of numbers to
obtain any sum we want.

4. The lesson here is that the commutative law, a + b = b + a, can fail
when adding infinitely many positive and negative terms.
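The rearrangement paradox is easy to see numerically. The sketch below is an illustration only; `original_partial` and `rearranged_partial` are hypothetical names, and the rearranged order (one odd-denominator term followed by two even-denominator terms) is the one described in this section.

```python
import math


def original_partial(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... using n_terms terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))


def rearranged_partial(n_groups):
    """Partial sum of the rearrangement: one odd term, then two even terms, repeated."""
    total = 0.0
    for j in range(1, n_groups + 1):
        total += 1 / (2 * j - 1)   # next odd denominator, added
        total -= 1 / (4 * j - 2)   # next unused even denominator, subtracted
        total -= 1 / (4 * j)       # the even denominator after that, subtracted
    return total


print(original_partial(100_000), math.log(2))
print(rearranged_partial(100_000), math.log(2) / 2)
```

Both sums use exactly the same terms, yet the first approaches ln 2 and the second approaches half of that.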


Reading: 

Y. E. O. Adrian, The Pleasures of Pi, e and Other Interesting Numbers. 

Daniel D. Bonar and Michael J. Khoury, Real Infinite Series.

Questions to Consider: 

1. Prove 1/2 + 1/6 + 1/12 + 1/20 + 1/30 + ... = 1, where the first denominator
is 1 × 2, the second denominator is 2 × 3, and so on. (Hint: 1/12 = 1/3 - 1/4).

2. Suppose that in the harmonic series, we throw away all terms with the 
number 9 in the denominator (i.e., we eliminate such numbers as 1/9, 1/19, 
1/29, 1/97, 1/3141592, and so on). Show that this 9-less series converges. 
Lecture Eighteen
The Joy of Differential Calculus 


Scope: Calculus is the mathematics of how things change. In algebra, we
learned to measure the slope of a straight line. But how do we measure
slope on a curve or, more precisely, the slope of the tangent line at any
point on a curve? By computing slopes of secant lines and working
with the concept of a limit, we can answer this important question.
Beginning with the derivative (rate of change) of the function $y = x^2$,
we learn such rules as the power rule, product rule, and quotient rule 
that allow us to compute derivatives of polynomials and more 
complicated functions, including trigonometric functions. 

Outline 

I. The words calculus and calculate have the same root, which is calculus, 
literally meaning “pebble.” Pebbles were the first calculating devices. In 
calculus, we learn to calculate how things grow and change. There are many 
applications for calculus, particularly in astronomy, physics, chemistry, 
economics, and engineering. 

A. In this first lecture on calculus, we’ll have fun with functions, seeing 
how they grow and change over time. 

B. In the next lecture, we’ll find an approach for approximating any 
function with a polynomial, the simplest of functions. 

C. In our third lecture on calculus, we’ll explore the fundamental theorem 
of calculus, which allows us to calculate areas and volumes that are 
impossible to find using only the tools of geometry and trigonometry. 

II. We begin with the study of slopes, which we encounter every day. Any time 
one quantity varies with another quantity, such as in calculating miles per 
gallon or price per pound, the idea of slope is involved. 

A. Mathematically, the simplest slopes are straight lines, where the slope 
is constant. For instance, we know from our earlier discussion of 
algebra that the function y = 2x + 3 produces a line with a slope of 2. 

B. The line for the function y = 4x - 1 has a much steeper slope, 4. The line
y = -x has a constant slope of -1. Finally, a constant function, which is
also a straight line, such as y = 5, has a slope of 0.

C. Lines have the same slope everywhere, but calculus applies our 
knowledge of lines to curves, which are not nearly as simple. For 
instance, let’s look at the parabola $y = x^2 + 1$. It doesn’t make any sense
to try to find the slope of a parabola, because it’s constantly changing. 
But we can ask how fast the function is growing at a specific point. 

1. When x = 3 on this graph, $y = 3^2 + 1$, which is 10. How fast is the
function growing at the point (3, 10)? We’re interested in the slope
of the line that just touches the graph at the point (3, 10). That line
is called the tangent line, and our mission is to calculate the slope
of that tangent line.

2. We need two points to figure out the slope, but we can use a point
that’s close to (3, 10) that also lies on the parabola. Let’s look at
$x = 3 + h$; the y value for that would be $(3 + h)^2 + 1$. When we expand
that, we get $h^2 + 6h + 10$. Now we have two points on the parabola.
The first point, $(x_1, y_1)$, is equal to (3, 10). The second point, $(x_2, y_2)$, is
equal to $(3 + h, h^2 + 6h + 10)$.

3. To calculate the slope of the line that goes through those two
points, we have to calculate the change in y divided by the change
in x. The symbol used in calculus to express “change in” is delta, $\Delta$.
Thus, to calculate the change in y divided by the change in x, we
look at $\Delta y/\Delta x$. Algebraically, that’s equal to $(y_2 - y_1)/(x_2 - x_1)$.

4. The change in y is $h^2 + 6h + 10 - 10$, and the change in x is
$3 + h - 3$. Simplifying, that’s $(h^2 + 6h)/h$; when we divide by h,
we’re left with h + 6.

5. That result tells us that the slope of the line that goes through the
point (3, 10) and the point very close to (3, 10) is equal to h + 6.
As we let h get closer to 0, the slope of that line gets closer to 6.
In the limit as h goes to 0, h + 6 becomes 6; therefore, the slope of the
tangent line is 6 when x = 3.
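The secant-line calculation can be tried with actual numbers. In this minimal sketch, `f` and `secant_slope` are names chosen for the illustration; the code simply evaluates the slope through (3, 10) and a nearby point on the parabola for smaller and smaller h.

```python
def f(x):
    """The parabola from the example."""
    return x ** 2 + 1


def secant_slope(x, h):
    """Slope of the line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


for h in [1, 0.1, 0.01, 0.001]:
    # Algebraically this slope is h + 6, so it approaches 6 as h shrinks.
    print(h, secant_slope(3, h))
```

The printed slopes track h + 6, which is why the tangent line at x = 3 has slope 6.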

D. We could go through the same argument for other points on the
parabola. For instance, we could use the same algebra to find the slope
at the point $(x, x^2 + 1)$, which is simply 2x. When x = 1, the slope of
that tangent line is 2. When x = -3, the slope of that tangent line is -6.
When x = 0, the slope of that tangent line is 0.

1. In general, for the function $y = x^2 + 1$, the slope at the point x is
equal to 2x, and we represent that with the notation y' = 2x. The
term for y' is the slope function or the first derivative.

2. Note that if we raise or lower the function $y = x^2 + 1$, the tangent
line still has the same slope as it did before. If we’re looking at the
function $x^2 + 17$, or $x^2$, we still have y' = 2x.

E. The official definition of the derivative is as follows: For any function
y = f(x), we define y' as $(f(x + h) - f(x))/h$ (that’s the change in y divided
by the change in x) and we take the limit of that as h goes to 0.
Calculating this is called differentiation. Other notations for y' include
f'(x) and dy/dx.

F. As we just saw, if $y = x^2$, its derivative is y' = 2x. By using the same
kind of logic we just used, we can come up with some general rules for
calculating derivatives.

1. For example, if $y = x^3$, then $y' = 3x^2$. If $y = x^4$, then $y' = 4x^3$. In
general, if $y = x^n$, then $y' = nx^{n-1}$. Even when the exponent is 1,
y = x, the derivative would be $1x^0 = 1$. That makes sense because
the slope of the line y = x is constantly 1.

2. We can also multiply by a constant when we’re differentiating. For
instance, given that $y = x^2$ has the derivative 2x, then $y = 10x^2$
would have the derivative 10(2x), or 20x.

3. Here’s another simple rule: the derivative of the sum is equal to the
sum of the derivatives. For instance, if we know that the derivative
of $4x^3$ is $12x^2$, the derivative of $8x^2$ is 16x, the derivative of -3x is
-3, and the derivative of 7 (that’s a constant function, y = 7) is 0,
and we want to find the derivative of the sum of all those
functions, then we use this rule to get $12x^2 + 16x - 3$.
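The power rule, constant-multiple rule, and sum rule together are enough to differentiate any polynomial mechanically. As a rough sketch, suppose a polynomial is stored as a list of coefficients with the lowest power first (a representation assumed here purely for illustration, as is the name `differentiate`):

```python
def differentiate(coeffs):
    """Differentiate c0 + c1*x + c2*x^2 + ... given as [c0, c1, c2, ...].

    The power rule turns c*x^n into n*c*x^(n-1); the sum rule lets us
    handle each term separately. The constant term drops out.
    """
    return [n * c for n, c in enumerate(coeffs)][1:]


# 4x^3 + 8x^2 - 3x + 7, written lowest power first
poly = [7, -3, 8, 4]
print(differentiate(poly))   # [-3, 16, 12], i.e. 12x^2 + 16x - 3
```

The result matches the derivative worked out by hand for the same four terms.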

III. Now that we know how to calculate some derivatives, let’s look at what we
can do with this knowledge.

A. We begin with the function $y = x^2 - 8x + 10$. Looking at a graph of that
function, we might ask: Where is that function minimized? Remember
we said that when a function reaches its low point, the slope of the
tangent line is 0. Wherever a function reaches its minimum or its
maximum (that is, whenever we go from decreasing to increasing or
from increasing to decreasing), the slope of the tangent line is 0.

1. We can find where this function is minimized by finding where the
derivative of that function is equal to 0. The derivative of
$x^2 - 8x + 10$ is 2x - 8.

2. When does that equal 0? Solving 2x - 8 = 0, we get x = 4. That
function is minimized when x = 4.

B. Let’s do another application, this one involving Laurel’s Lemonade 
Stand. For my daughter’s lemonade stand, we decided that if she 
charged x cents per cup, she would sell (50 - x) cups in one day. 

1. If Laurel sells (50 - x) cups, then her revenue is x(50 - x), which is
50x - x^2. That’s the revenue function, which we’ll call R(x). The
graph of that function is an inverted parabola. Where is that
function maximized?

2. We set the derivative of 50x - x^2 equal to 0; thus, 50 - 2x = 0, which
holds when x = 25. If Laurel charges 25 cents, she can expect to earn
25(50 - 25), or 625 cents, or $6.25. 

C. Laurel’s sister, Ariel, wants to create a box where Laurel can keep her
supplies. She will make the box, without a lid, from a 12-inch square piece of
cardboard. To create the box, she cuts four x-by-x squares out of the
corners of the cardboard and folds up the edges. What will be the
volume of the box?

1. The volume of a box is length times width times height. If Ariel
cuts out an x-by-x square from each of the corners, then the length
of the base will be 12 - 2x; the width will also be 12 - 2x, and the
height when the tabs are folded up is x. Thus, the volume is
(12 - 2x)(12 - 2x)x; if we expand that, we get $4x^3 - 48x^2 + 144x$,
which we call v(x).
2. How can we maximize that volume? We set the derivative of the
volume equal to 0. Using the power rule and sum rule, we get
$v'(x) = 12x^2 - 96x + 144 = 12(x^2 - 8x + 12)$.

3. Setting this equal to zero and dividing by 12 gives us
$x^2 - 8x + 12 = 0$. We can then factor that polynomial to get
(x - 6)(x - 2) = 0.

4. Of course, the product of those two numbers can be 0 only if one
of the numbers is itself 0. That means either x - 6 = 0 or x - 2 = 0.
Thus, to determine where the volume of the box is maximized, we
only need to consider when x = 6 or when x = 2.

5. We can tell, either by looking at the graph of that function or by
actually plugging in the numbers, that when we let x = 6, the
volume of the box is 0. When x = 2, however, we get the biggest
volume: (12 - 2x)(12 - 2x)x = (12 - 4)(12 - 4)2 = 128 cubic inches.
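As a sanity check on the box problem, we can compare the calculus answer with a brute-force search. This sketch assumes a 12-by-12-inch sheet and tries cut sizes in steps of 0.01 inch; the function names and the grid spacing are choices made for the illustration.

```python
def volume(x):
    """Volume of the open box when x-by-x squares are cut from each corner."""
    return (12 - 2 * x) ** 2 * x


def volume_derivative(x):
    """Derivative from the power and sum rules: 12x^2 - 96x + 144."""
    return 12 * x ** 2 - 96 * x + 144


# Try every cut size from 0 to 6 inches in steps of 0.01 inch.
candidates = [i / 100 for i in range(0, 601)]
best = max(candidates, key=volume)

print(best, volume(best))                          # cut size with the largest volume on the grid
print(volume_derivative(2), volume_derivative(6))  # both critical points give slope 0
```

The search lands on the same cut size that setting the derivative to 0 produces, and the other critical point, x = 6, gives a box with no volume at all.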

D. So far, we’ve solved only problems that involve polynomials, but the 
power rule is actually even more powerful than it sounds. 

1. Again, the rule is that the derivative of $x^n$ is $nx^{n-1}$, and it works for
any exponent n, even negative integers or fractions.

2. For instance, $y = x^{-1}$ is the function y = 1/x. The derivative of that,
by the power rule, would be $-1(x^{-1-1})$, or $-1(x^{-2})$. In other words,
$y' = -1/x^2$.

3. If we were interested in differentiating $y = 1/x^2$, that would be
$y = x^{-2}$. If we differentiate that, we get $-2x^{-3}$, or $-2/x^3$. Here’s a
derivative that we’ll see later: $y = \sqrt{x} = x^{1/2}$. If we differentiate
that, we get $y' = \frac{1}{2}x^{1/2 - 1} = \frac{1}{2}x^{-1/2}$, which equals $\frac{1}{2\sqrt{x}}$.

IV. We might also be interested in differentiating the trigonometric functions
and the exponential function. Such functions model how sound waves travel
or how money grows, and their derivatives are well worth memorizing.

A. The derivative of the sine function is the cosine function. That is, if 
y = sin x, then y' = cos x. 

B. The derivative of the cosine function is the negative of the sine 
function. That is, if y = cos x, then y' = -sin x. 

C. The most important function in calculus is the function $y = e^x$ because,
as mentioned earlier, the derivative of $e^x$ is $y' = e^x$. This function tells us
that when we plug in x, not only do we get a value of y, but we also get
the slope of the tangent line, that is, how fast that function is changing.

D. The derivative of the natural log of x, ln x, is equal to 1/x.

V. Let’s try to clarify why the derivative of sin x is cos x. 

A. We can look at a graph of the sine function just to see how it increases
and decreases. Here, we have the graph of y = sin x. Let’s estimate the
slope at various points along the graph.

B. For instance, when x = 0, the slope of the sine function looks close to 1.
At the point $x = \pi/2$, at 90°, we have a slope of 0. Down at $\pi$, at 180°,
we have a slope of -1. At the bottom of the graph, at $3\pi/2$, we again
have a slope of 0. At $x = 2\pi$, we’re back to where we started,
and we again have a slope of 1.

C. The pattern of these slopes, 1, 0, -1, 0, 1, ..., will repeat forever. If we
connect the dots of the slope function, we see that it looks very much
like the cosine function.
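Instead of eyeballing the graph, we can estimate the slopes of the sine curve numerically. The helper `slope` below is a hypothetical name for a standard central-difference estimate; the step size `h` is an arbitrary small number chosen for the illustration.

```python
import math


def slope(f, x, h=1e-6):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


for x in [0, math.pi / 2, math.pi, 3 * math.pi / 2, 2 * math.pi]:
    # Left column: estimated slope of sin at x. Right column: cos x.
    print(round(slope(math.sin, x), 4), round(math.cos(x), 4))
```

The two columns agree at every sample point, which is the numerical version of the statement that the derivative of sin x is cos x.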

VI. We know that the derivative of the sum is the sum of the derivatives, but is 
the derivative of the product the product of the derivatives? Unfortunately, 
the answer is no. Instead, the derivative of the product is “the first times the 
derivative of the second plus the derivative of the first times the second.” 

A. The product rule is written as follows: If y = f(x)g(x), then
y' = f(x)g'(x) + f'(x)g(x).

1. For example, if we’re looking at $y = x^2 \sin x$, we know the
derivative of each of $x^2$ and sin x, and we can use that to find the
derivative of their product.

2. The derivative of the product is the first times the derivative of the
second, which would be $x^2 \cos x$, plus the derivative of the first times
the second, which would be 2x sin x. When you add those
together, you get the derivative: $x^2 \cos x + 2x \sin x$.

B. The quotient rule is as follows: If y = f(x)/g(x), then
$y' = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}$.
To remember it, instead of thinking f(x)/g(x), think high
over low, or “hi” over “ho.” Then, you can remember y' as:
ho-di-hi minus hi-di-ho over ho-ho.

1. For instance, suppose we were constructing an elementary model of
planetary motion using a yo-yo moving at constant speed.

2. The tangent of x could tell us the slope of the string when the
yo-yo is at time x, and the derivative of the tangent of x could tell
us how fast that slope is changing at time x. We want to calculate
the derivative of the tangent of x; that’s sin x/cos x. By the quotient
rule, sin x/cos x has derivative $\frac{\cos x \sin'(x) - \sin x \cos'(x)}{(\cos x)^2}$, which
is $\frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x}$. Thus, tan x has derivative $1/\cos^2 x$.

VII. As we said at the outset, calculus is the mathematics of how things grow. In
general, there are three ways that functions grow.

A. Functions may have a constant growth; those functions are represented
by straight lines. Functions may also grow in proportion to their input.
For example, a falling body travels faster and faster according to how
long it has been traveling. Finally, functions can grow in proportion to
their output. Those functions describe how your bank account grows or
how the population grows.

B. All of these functions are described by differential equations, which 
sometimes involve taking derivatives of derivatives, called second 
derivatives. Mathematics, being the language of science, is actually 
expressed through differential equations. For instance, these equations 
can describe pendulum motion, vibration, pacemakers, even the beating 
of your heart. In fact, it’s safe to say that, on some levels, your life 
actually depends on calculus. 

Reading: 

Colin Adams, Joel Hass, and Abigail Thompson, How to Ace Calculus: The
Streetwise Guide.

Silvanus P. Thompson and Martin Gardner, Calculus Made Easy. 

Questions to Consider: 

1. A manufacturer wants to create a can that will contain 1 liter of liquid. Use
differential calculus to determine the dimensions of the can that will
minimize the surface area of the can. (Hint: A cylinder with base radius r
and height h has volume $\pi r^2 h$ and surface area $2\pi r h + 2\pi r^2$. CAN you see
why?)

2. For the function $y = x^3$, what is the slope of the tangent line that passes
through the point (2, 8)? What is the equation for that line?

3. Find the dimensions of a rectangle with perimeter P that has the largest 
area. 
Lecture Nineteen

The Joy of Approximating with Calculus 

Scope: In the previous lecture, we began our study of calculus by calculating 
the slope of tangent lines for many functions, leading to many 
interesting applications. In this lecture, we’ll see how this simple idea, 
the slope of the tangent line, has beautiful consequences. We’ll also 
learn a new technique of differentiation called the chain rule, and we’ll
see how to approximate any function by a polynomial. Because
polynomials are the simplest kinds of functions to work with and we 
know how to differentiate them, this tool is especially useful. 

Outline 

I. We begin with the chain rule, which refers to chains of functions. 

A. We know, for example, that the derivative of sin x is cos x. We also
know that the derivative of $x^3$ is $3x^2$. Suppose we want to combine those
two functions and calculate the derivative of $\sin(x^3)$.

1. You might guess that the derivative of $\sin(x^3)$ is $\cos(x^3)$ or $\cos(3x^2)$.
Both answers are wrong, but they’re close. The actual answer is
$\cos(x^3)(3x^2)$.

2. In general, if we want to take the sine of g(x) and find the
derivative of that function, sin(g(x)), the derivative is equal to
cos(g(x)) g'(x), or g'(x) cos(g(x)).

B. Let’s do another example. Recall that the derivative of the function $e^x$ is
still $e^x$. What about the derivative of $e^{x^3}$?

1. The chain rule tells us that the answer is $e^{x^3}$ times the derivative
of $x^3$, which is $3x^2$; thus, the derivative we’re looking for is $3x^2 e^{x^3}$.

2. In general, $e^{g(x)}$ has derivative $g'(x) e^{g(x)}$.

3. We can also improve the first differentiation rule we learned, the
power rule. According to this, if $y = x^n$, then the derivative is
$nx^{n-1}$. The derivative of $[g(x)]^n$ would be $n[g(x)]^{n-1}$
times the derivative of g(x). That is, if $y = [g(x)]^n$, then
$y' = n[g(x)]^{n-1} g'(x)$.

4. For instance, let’s calculate the derivative of $(x^3)^5$. According to the
chain rule, that’s $5(x^3)^4(3x^2) = 15x^{12}x^2 = 15x^{14}$.

5. We can verify this answer because the problem started off as $(x^3)^5$,
which is just an unusual way of writing $x^{15}$, and we know from the
power rule that the derivative of $x^{15}$ is, indeed, $15x^{14}$.

6. In general, the chain rule says that if we have a function of a
function, y = f(g(x)), then y' = f'(g(x)) g'(x).
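One way to see that the two tempting guesses for the derivative of sin(x^3) are wrong, and that the chain-rule answer is right, is to compare all three against a numerical slope estimate. The sketch below is illustrative only: `numeric_derivative`, the dictionary of candidates, and the test point x = 1.2 are all choices made for this example.

```python
import math


def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def f(x):
    return math.sin(x ** 3)


candidates = {
    "cos(x^3)": lambda x: math.cos(x ** 3),                   # forgets the inner derivative
    "cos(3x^2)": lambda x: math.cos(3 * x ** 2),              # differentiates the inside only
    "3x^2 cos(x^3)": lambda x: 3 * x ** 2 * math.cos(x ** 3)  # chain rule
}

x = 1.2
for name, guess in candidates.items():
    # Print how far each candidate is from the numerically estimated slope.
    print(name, abs(guess(x) - numeric_derivative(f, x)))
```

Only the chain-rule candidate has an error near zero; the other two miss by a wide margin.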
C. Let’s now use the chain rule to solve the following cow-culus problem:
Claudia the cow is 1 mile north of the X-Axis River, which runs east to
west. Her barn is 3 miles east and 1 mile north of her position. She wishes
to drink from the X-Axis River, then walk to her barn in such a way as to
minimize her total amount of walking. Where on the river should she
stop to drink?

1. If she starts at the point (0, 1), her barn is at (3, 2).

2. Suppose she decides to drink from the point x along the river, that
is, at the point (x, 0). As she walks from her starting point to x, she
creates a right triangle with height 1, base length x, and
hypotenuse length $\sqrt{x^2 + 1}$.

3. Then, Claudia has to walk from point x to her barn. That’s another
triangle where the base has length 3 - x, the height is 2, and the
hypotenuse is $\sqrt{(3 - x)^2 + 2^2}$. When we expand that, we have
$\sqrt{x^2 - 6x + 13}$. The total distance that Claudia walks when she
stops at x is
$f(x) = (x^2 + 1)^{1/2} + (x^2 - 6x + 13)^{1/2}$.

4. By the chain rule,
$f'(x) = \frac{1}{2}(x^2 + 1)^{-1/2}(2x) + \frac{1}{2}(x^2 - 6x + 13)^{-1/2}(2x - 6)$.

5. We want to find the place where the function f(x) is minimized,
and to find such a point, we need to find where the derivative is 0.

6. The solution to the equation f'(x) = 0 is x = 1, which we can verify.

7. I gave you the solution of x = 1, but how could we have derived it?
The fact is that this problem can be solved, if you’ll pardon the
pun, after just a moment’s reflection, without ever using calculus.

a. Imagine that Claudia, as she walks from her original position
to the X-Axis River, instead of walking back to her barn at the
coordinates (3, 2), walks to the barn’s reflection at the point
(3, -2).

b. Notice that the distance from her drinking point to (3, 2) is the
same as the distance from her drinking point to (3, -2).

c. Since the shortest distance between two points is a straight
line, to find the optimal path, we draw a straight line from
the original point at (0, 1) to the reflected point at (3, -2). The
slope of that line is -3/3 = -1. If the line starts at the point
(0, 1), then it will hit the x-axis at the point where x = 1.
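Both solutions to Claudia's problem can be checked with a few lines of code. This sketch assumes the coordinates used in the text (start at (0, 1), barn at (3, 2), river along the x-axis); the names `walk` and `walk_derivative` and the search grid are made up for the illustration.

```python
import math


def walk(x):
    """Total distance from (0, 1) to the river at (x, 0), then on to the barn at (3, 2)."""
    return math.hypot(x, 1) + math.hypot(3 - x, 2)


def walk_derivative(x):
    """Derivative of walk(x) obtained from the chain rule."""
    return x / math.sqrt(x ** 2 + 1) + (x - 3) / math.sqrt(x ** 2 - 6 * x + 13)


# Brute-force search over drinking points between x = 0 and x = 3.
grid = [i / 1000 for i in range(0, 3001)]
best = min(grid, key=walk)

print(best, walk(best))        # best drinking point found on the grid
print(walk_derivative(1.0))    # slope of the distance function at x = 1
print(math.dist((0, 1), (3, -2)))  # straight-line distance to the reflected barn
```

The grid search, the zero of the derivative, and the reflection argument all point to the same spot on the river, and the shortest walk equals the straight-line distance to the reflected barn.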

II. Now let’s look at a way to approximate the square root of any number in
your head. Our tool for this is the all-purpose approximation formula.

A. The all-purpose formula works for almost any differentiable function. It
says: $f(a + h) \approx f(a) + h f'(a)$. Generally, the smaller h is, the better
the approximation is.

B. The reason this formula works is fairly simple. If we go back to the
original definition of the derivative, $f'(a) \approx \frac{f(a + h) - f(a)}{h}$. As h
goes to 0, that approximation becomes exact. Let’s now use this
approximation formula to calculate square roots in our heads.

C. We know that the function $f(x) = \sqrt{x}$ has derivative $f'(x) = \frac{1}{2\sqrt{x}}$. In
particular, if we plug in the value x = a, we get $f'(a) = \frac{1}{2\sqrt{a}}$.


D. Let’s say we want to estimate $\sqrt{106}$. We can break 106 into 100 + 6,
and we’ll let a = 100 and h = 6. Our approximation formula tells us that
$\sqrt{106} = f(106) \approx f(100) + 6 f'(100)$. But f(100) is $\sqrt{100} = 10$. To
that, we add $6 f'(100) = \frac{6}{2\sqrt{100}} = 6/20$. Hence, our approximation is
10 + 6/20 = 10.3. As it turns out, $\sqrt{106} = 10.295\ldots$.


E. Let’s do another example: $\sqrt{456}$. We know that $\sqrt{400} = 20$, so our
first guess is 20, with h = 56. We take
20 + 56/(2 × 20), which equals 20 + 1.4 = 21.4.

1. We can get an even better approximation using the process for
squaring numbers that we learned in one of our lectures on algebra.
Using this process, we know that $21^2 = 441$, which makes our error
smaller; h is only 15 instead of 56.

2. In this case, we calculate $\sqrt{456}$ as 21 + 15/(2 × 21) = 21 + 15/42 ≈
21.357. The exact answer is 21.354....
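The mental square-root method is short enough to write as a function. In this sketch, `approx_sqrt` is a hypothetical name; it takes the number n and the root of a nearby perfect square, and applies the approximation f(a) + h f'(a) with f(x) equal to the square root of x.

```python
import math


def approx_sqrt(n, a_root):
    """Estimate sqrt(n) from a nearby perfect square a = a_root**2.

    With h = n - a, the approximation formula gives a_root + h / (2 * a_root).
    """
    h = n - a_root ** 2
    return a_root + h / (2 * a_root)


for n, a_root in [(106, 10), (456, 20), (456, 21)]:
    # Compare the mental estimate with the value from the math library.
    print(n, a_root, approx_sqrt(n, a_root), math.sqrt(n))
```

Starting from the closer perfect square (441 rather than 400) makes h smaller and the estimate noticeably better, as the text predicts.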


III. Let’s return to the approximation formula that says $f(a + h) \approx f(a) + h f'(a)$.

A. We plug in a = 0 and replace h with x to get a much simpler looking
equation. This says $f(x) \approx f(0) + f'(0)x$. Once we have the function f,
f(0) is just a number, as is f'(0). If f(x) is some number plus some
other number times x, that’s the equation of a line with a slope of
f'(0). That line goes through the point (0, f(0)). In other words, we’re
approximating the function f(x) with a straight line that goes through
the same point, (0, f(0)), with the same slope.

B. Let’s look at the graph of $y = e^x$. Near the point (0, 1), we have a line
(actually, it’s the line y = 1 + x) that looks just like the function $e^x$, at
least when x is close to 0.

C. If we want an even better approximation, then we look for a parabola, a
second-degree polynomial, to go through the same point. Because we
have one extra degree of freedom, not only will the parabola go through
that point with the same slope (the same first derivative), but the
parabola will also go through that point with the same second
derivative.

D. The magic formula for that is $f(x) \approx f(0) + f'(0)x + \frac{f''(0)x^2}{2!}$. Now we
have a parabola that matches the function with the same first derivative
and second derivative. We call this the second-degree Taylor
polynomial approximation.

E. If we want an even better fit, we can get a third-degree approximation
by adding a cubic term: $\frac{f'''(0)x^3}{3!}$. The reason we use 3! is that now
that function will match the original function through the point (0, f(0))
with the same first derivative, second derivative, and third derivative.

F. We can also bring this function out to even higher degrees with the
same kind of formula.

1. Let’s use the function $f(x) = e^x$; we choose this function because
f'(x), its first derivative, is $e^x$. The second derivative is also $e^x$,
as is the third derivative. When we plug those in at 0, f(0) = 1,
f'(0) = 1, f''(0) = 1, and f'''(0) = 1. That tells us that near the
point x = 0, $e^x$ is approximately $1 + x + x^2/2! + x^3/3!$.

2. We’re approximating the important function $e^x$ by a cubic
polynomial, and when we’re close to 0, it’s a pretty good fit.

3. The nth-degree Taylor polynomial would be
$1 + x + x^2/2! + \cdots + x^n/n!$. If we let n go to infinity, we get perfect
accuracy for all values of x. This is called the Taylor series of $e^x$,
and it has amazing consequences.

4. For instance, look what happens when we differentiate the Taylor
series for $e^x$ (which is $1 + x + x^2/2! + x^3/3! + \cdots$), one term at a time:
The derivative of 1 with respect to x is 0. The derivative of x is 1.
The derivative of $x^2/2!$ is x. The derivative of $x^3/3!$ is $3x^2/3!$, but the
3’s cancel and we’re left with $x^2/2!$. The derivative of $x^4/4!$ is
$4x^3/4!$, which is $x^3/3!$. When we differentiate the terms of the series
for $e^x$, we get $e^x$ again, which makes sense because the derivative of
$e^x$ is $e^x$.
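To get a feel for how quickly the Taylor polynomials of the exponential function close in on the true value, we can evaluate them for increasing degree. This is a minimal sketch; `exp_taylor` is a name invented for the example, and it builds each term from the previous one so that no factorial has to be computed explicitly.

```python
import math


def exp_taylor(x, degree):
    """Taylor polynomial 1 + x + x^2/2! + ... + x^degree/degree!."""
    term = 1.0    # the current term x^n / n!, starting with n = 0
    total = 1.0
    for n in range(1, degree + 1):
        term *= x / n   # x^n/n! equals x^(n-1)/(n-1)! times x/n
        total += term
    return total


for degree in [1, 2, 3, 5, 10]:
    # Error of the degree-n polynomial at x = 1, where the exact value is e.
    print(degree, exp_taylor(1.0, degree), abs(exp_taylor(1.0, degree) - math.e))
```

Even at x = 1, which is not especially close to 0, the error shrinks rapidly as the degree grows.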

IV. Let’s look at some more important Taylor series, which can be derived in
the same way that we derived the $e^x$ series.

A. For instance, sin x has the following Taylor series:
$x - x^3/3! + x^5/5! - x^7/7! + x^9/9! - \cdots$. This looks just like the odd terms of the $e^x$ series except
that the signs alternate. Let’s look at the graph of y = sin x and its
approximation with the function y = x. The function $x - x^3/3!$ is an even
better approximation, and the fifth-order Taylor approximation,
$x - x^3/3! + x^5/5!$, is even better.

B. We can figure out the series for cos x by differentiating the series for
sin x. We know that the derivative of sin x is cos x; differentiating the
terms of the Taylor series for sin x, we get the series for cos x, namely
$1 - x^2/2! + x^4/4! - x^6/6! + \cdots$. Those are the even terms of the $e^x$ series,
again with the signs alternating.

V. Now, let’s have some more fun with functions. 

A. Look at the series for $e^{-x}$; that’s what we get when we take the $e^x$
series and replace all the x’s with -x. This gives us
$e^{-x} = 1 - x + x^2/2! - x^3/3! + x^4/4! - \cdots$. Thus, the $e^{-x}$ series looks like the $e^x$ series except the
signs are alternating.

B. If we add the $e^x$ series to the $e^{-x}$ series and divide by 2 (taking the
average of those two functions), we get the hyperbolic cosine function,
or cosh function. That is, $\cosh x = (e^x + e^{-x})/2$.

1. Look what happens when we add those series together: The odd
terms cancel, and the even terms stay the same. Thus,
$\cosh x = 1 + x^2/2! + x^4/4! + x^6/6! + \cdots$.

2. One reason that’s called the hyperbolic cosine is that its infinite
series looks just like the infinite series of the cosine function
except that the cosine function has alternating signs.

C. Similarly, if we subtract those two infinite series, the odd terms survive
and the even terms are eliminated. We’re left with
$\sinh x = (e^x - e^{-x})/2 = x + x^3/3! + x^5/5! + \cdots$. It looks just like the series for the sine function except
that it doesn’t have alternating signs. That’s called the hyperbolic sine
function, or sinh function.

D. Notice also that sinh' x = cosh x and cosh' x = sinh x.

E. We see hyperbolic functions everywhere in our daily lives. For
instance, a hanging cable or piece of rope always fits a cosh curve. In
fact, every hanging rope or chain is of the form $y = a\cosh\left(\frac{x}{a}\right)$. Note
that to differentiate this function, we would use the “chain rule.”

F. Where does the word hyperbolic come from in these functions? We
know that $(\cos\theta, \sin\theta)$ lies on the unit circle since $\cos^2\theta + \sin^2\theta = 1$.
Similarly, we can show that $\cosh^2\theta - \sinh^2\theta = 1$, which means that
$(\cosh\theta, \sinh\theta)$ lies on the unit hyperbola, and that’s where the word
comes from.

G. Another easy property to verify is that $\cosh x + \sinh x = e^x$; that can be
verified by the series or by the original definition.
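The series for cosh and sinh, and the identities they satisfy, can be verified numerically. The sketch below sums a fixed number of terms of each series; `cosh_series`, `sinh_series`, and the choice of 20 terms are assumptions made for the illustration, and the results are compared with Python's built-in `math.cosh`, `math.sinh`, and `math.exp`.

```python
import math


def cosh_series(x, terms=20):
    """Even terms of the exponential series: 1 + x^2/2! + x^4/4! + ..."""
    return sum(x ** (2 * k) / math.factorial(2 * k) for k in range(terms))


def sinh_series(x, terms=20):
    """Odd terms of the exponential series: x + x^3/3! + x^5/5! + ..."""
    return sum(x ** (2 * k + 1) / math.factorial(2 * k + 1) for k in range(terms))


x = 1.5
c, s = cosh_series(x), sinh_series(x)
print(c, math.cosh(x))      # series value next to the library value
print(s, math.sinh(x))
print(c ** 2 - s ** 2)      # the point (cosh x, sinh x) lies on the unit hyperbola
print(c + s, math.exp(x))   # cosh x + sinh x recovers e^x
```

Splitting the exponential series into its even and odd terms really does give two functions whose squares differ by 1 and whose sum is the exponential.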

VI. We’ve seen a number of parallels between the hyperbolic functions and the
trigonometric functions, and if $\cosh x + \sinh x = e^x$, then there must be some
connection among cos x, sin x, and $e^x$.

A. The connection is Euler’s equation: $e^{ix} = \cos x + i \sin x$.

B. We could prove that by the series for $e^x$, replacing all the x’s with ix’s.
As that i is raised to different powers ($i^0 = 1$, $i^1 = i$, $i^2 = -1$, $i^3 = -i$,
$i^4 = 1$), the sign pattern is: 1, i, -1, -i, 1, i, -1, -i. As we look at that
pattern and separate the real part from the imaginary part, we get the
series for cos x plus i times the series for sin x. That’s the proof of
Euler’s equation: $e^{ix} = \cos x + i \sin x$.

C. Incidentally, as we observed earlier, when we let $x = \pi$, or 180°, then
$e^{i\pi} = -1$, that is, $e^{i\pi} + 1 = 0$. This equation was recently listed as number
two on a list in Physics World magazine of the 20 greatest equations.
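Python's built-in complex numbers make it possible to test Euler's equation by summing the exponential series with an imaginary input. This is a sketch under simple assumptions: `exp_series` is a hypothetical helper, and 30 terms is an arbitrary cutoff that is more than enough for the inputs used here.

```python
import math


def exp_series(z, terms=30):
    """Partial sum of 1 + z + z^2/2! + z^3/3! + ... for a real or complex z."""
    term = 1 + 0j    # current term z^n / n!
    total = 1 + 0j
    for n in range(1, terms):
        term *= z / n
        total += term
    return total


x = 0.7
print(exp_series(1j * x))                  # series evaluated at ix
print(complex(math.cos(x), math.sin(x)))   # cos x + i sin x
print(exp_series(1j * math.pi) + 1)        # should be essentially 0
```

The real part of the series matches cos x, the imaginary part matches sin x, and plugging in pi gives a result that differs from -1 only by rounding error.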

Reading: 

Colin Adams, Joel Hass, and Abigail Thompson, How to Ace Calculus: The
Streetwise Guide.

Silvanus P. Thompson and Martin Gardner, Calculus Made Easy. 

Questions to Consider: 

1. What is the value of 1 - 1/1! + 1/2! - 1/3! + 1/4! - 1/5! + ...? 

2. Use the approximation formula to derive a method for mentally determining
good approximations to $\sqrt[3]{a + h}$, where a is a number with a known cube
root. For example, come up with a good mental estimate of $\sqrt[3]{1024}$.


Lecture Twenty 
The Joy of Integral Calculus 


Scope: Geometry and trigonometry allow us to compute the area of simple 
geometrical figures, such as triangles and circles, but how do we 
measure the area or volume of more irregularly shaped objects? 
Calculus comes to the rescue. By adding lots of tiny quantities with 
simple areas (a process known as integration), we can find solutions to 
many practical problems. These calculations are streamlined through
the fundamental theorem of calculus, which relates the
area under a curve to the curve’s anti-derivative.

Outline 

I. Calculus is typically broken into two parts: differential calculus and integral 
calculus. Differential calculus, as we’ve studied in our last two lectures, is 
the mathematics of how things change and grow. Integral calculus is used, 
among other things, to calculate areas and volumes. 

A. The big idea in both differential calculus and integral calculus is to 
calculate quantities associated with curves using quantities associated 
with straight lines. For example, in differential calculus, we used our 
understanding of the slope of a straight line to calculate the slopes of 
parabolas and trigonometric functions. 

B. In integral calculus, where the goal is to calculate areas, we’ll begin by 
looking at areas we understand, such as the area of a rectangle, and use 
that knowledge to figure out, for example, the area under a curve. 

C. Initially, you wouldn’t expect to find much of a connection between the 
calculation of slopes and the calculation of area, yet those two concepts 
are intimately connected through the fundamental theorem of calculus. 
We’ll begin this lecture by looking at that theorem. 

II. The original problem of integral calculus is to find the area under some kind 
of a curve, and we can answer questions about that area using the 
fundamental theorem of calculus. 

A. Suppose we want to carpet a room that is mostly rectangular but has a
curved section described by the function y = f(x) as x goes from a to b.
According to the fundamental theorem of calculus, to find the area of that
region, we first have to find a function, F(x), with F'(x) = f(x). Once we’ve
found that function, we calculate the area with the formula: F(b) - F(a).

B. Let’s look at a specific example, the parabola described by the function
y = x^2. Suppose we want to find the area under the curve as x goes from
1 to 4. The first step in the fundamental theorem of calculus is to find a
function with F'(x) = x^2.

1. If we differentiate x^3/3, we know from the power rule that we get
3x^2/3. The 3’s cancel and we’re left with x^2. Thus, F(x) = x^3/3.

2. The next step is to plug in the endpoints to the function we just
found. In other words, we calculate F(4) - F(1) = 4^3/3 - 1^3/3 =
64/3 - 1/3 = 63/3, which is exactly 21.

3. Therefore, by the fundamental theorem of calculus, the area under the
parabola between 1 and 4 is exactly 21. The notation we use for this is
∫_1^4 x^2 dx = 21.

4. In this lecture, we’ll see how to interpret integrals as infinite sums.

C. Let’s do another example. We’ll calculate the area under the curve for the
function y = sin x as x goes from 0 to π.

1. Before we calculate, we do a bit of guessing. We know that the
sine function, at its peak, has a height of 1. We could enclose that
entire curve inside a rectangle that has a height of 1 and a length of
π; thus the area under the curve can’t be bigger than π.

2. To apply the fundamental theorem, we must find a function whose
derivative is sin x, and that function is F(x) = -cos x. We then
evaluate F(π) - F(0) = -cos(π) + cos(0) = -(-1) + 1 = 2; the area
under the curve is 2.

D. If we’re looking at a curve that goes above and below the x-axis, then
we have to interpret the integral slightly differently. For instance, if
we’re looking at the function y = sin x as x goes from 0 to 2π, then the
area below the x-axis is counted negatively.

1. With that information, what would you expect to find for
∫_0^(2π) sin x dx? Is there more area above the x-axis, more area below
it, or are they equal? Because the function looks
symmetrical, we would expect the positive part and the negative part
to cancel each other out and give us an answer of 0.

2. Let’s apply the fundamental theorem of calculus to see if we get
that answer. The anti-derivative of sin x is still -cos x. We evaluate
this at the endpoints 0 and 2π, but cos(2π) is the same as cos 0, so
they cancel each other out exactly. Hence, this integral results in 0,
as expected.
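
The three areas computed above can be double-checked by brute force, adding up
many thin rectangles and comparing with F(b) - F(a). This is only a sketch: the
helper name midpoint_sum and the choice of 100,000 rectangles are assumptions
made for the illustration, not something taken from the lecture.

```python
import math

def midpoint_sum(f, a, b, n=100_000):
    """Approximate the signed area under f from a to b with n thin rectangles."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

# Each line compares a rectangle sum with the answer from F(b) - F(a).
parabola = midpoint_sum(lambda x: x**2, 1, 4)        # F(x) = x^3/3
hump = midpoint_sum(math.sin, 0, math.pi)            # F(x) = -cos x
full_wave = midpoint_sum(math.sin, 0, 2 * math.pi)   # positive and negative cancel

print(parabola, 4**3 / 3 - 1**3 / 3)
print(hump, -math.cos(math.pi) + math.cos(0))
print(full_wave)
```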

The symbol ∫ is an elongated “s,” where “s” stands for “sum.”

III. What is it that makes the fundamental theorem of calculus do its magic?

A. Before we answer that, let’s look at a different question: Suppose we
have two functions that have the same derivative. Must those two
functions be the same? If f'(x) = g'(x), does that mean that f(x) = g(x)?
The answer is: almost, but not quite.

1. For example, what functions have the derivative 2x? We know that
x^2 has a derivative of 2x, as do x^2 + 1, x^2 + 17, and x^2 - π.

2. Anything that’s of the form x^2 + c has a derivative of 2x, and the
only functions that have a derivative of 2x are of the form x^2 + c.

3. Try to remember this theorem: If two functions have the same
derivative, then those two functions differ by a constant.
Mathematically, if f'(x) = g'(x), then f(x) = g(x) + c.

B. Knowing this theorem, we’re ready to answer the question: What
makes the fundamental theorem of calculus do its trick? Our goal is to
prove that if we have a function y = f(x) and we want to find the area
under the curve between the points x = a and x = b, we find a function
F whose derivative is f, then evaluate F(b) - F(a) to find the area.

C. We begin with the quantity R(x), which is the area of the region under 
the curve between a and x. Notice that as we vary x, the region under 
the curve also varies, and its area will vary. 

1. What if we move x on top of a? The area of the region, then, is 0. 
We’re looking at a straight line, which doesn’t have any area. 

2. Thus, R(a) = 0, as will be useful later.

D. Our goal with the fundamental theorem is to show that the area under 
the curve from a to b is F(b) - F(a). But, by definition, the area under 
the curve from a to b, is R(b). Thus, the goal of this theorem is to 
conclude that R(b) = F(b) - F(a). How are we going to get there? 

1. Remember, R(x) is the area under the curve as we go from a to x.
What’s R(x + h)? By definition, that is the area under the curve as
we go from a to (x + h). The difference in those quantities, R(x + h)
- R(x), is the area as we go from a to (x + h) minus the area as we
go from a to x. Almost everything gets canceled there except for
the tiny region between x and (x + h).

2. Looking at a blowup of that region, we see that if h is really small,
the region is almost rectangular, and its area, then, is
approximately the area of a rectangle with base h and height f(x);
thus, R(x + h) - R(x) is approximately h multiplied by f(x).

3. Dividing both sides of this approximation by h, we get (R(x + h) -
R(x))/h ≈ f(x). As we let h go to 0, the approximation becomes exact:
R'(x) = f(x). And since F'(x) = f(x), we have R'(x) = F'(x).

4. As we said earlier, if two functions have the same derivative, they 
differ by a constant; therefore, R(x) = F(x) + c. That constant must 
work for every value of x that we plug into it; in particular, it must 
work when we plug in the value x = a. 

5. If we plug in x = a, then R(a) = F(a) + c. Remember, though, that
R(a) = 0. Solving for c, we find that c = -F(a). Plugging that value
into the formula above, R(x) = F(x) - F(a) for all values of x.

6. Because that works for all values of x, in particular, it must work
for x = b; therefore, R(b) = F(b) - F(a).
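
The key step of the proof, that the area function R(x) has slope f(x), can be
watched numerically. The sketch below is an illustration only: it assumes
f(x) = x^2 with a = 1, and the helper name area_from, the point x = 2.5, and the
step h = 0.001 are arbitrary choices.

```python
def area_from(f, a, x, n=20_000):
    """R(x): approximate area under f between a and x (midpoint rectangles)."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

f = lambda t: t**2
F = lambda t: t**3 / 3          # an anti-derivative of f
a, x, h = 1.0, 2.5, 1e-3

# Difference quotient (R(x + h) - R(x)) / h, which should be close to f(x).
slope = (area_from(f, a, x + h) - area_from(f, a, x)) / h
print(slope, f(x))

# R(x) itself should match F(x) - F(a).
print(area_from(f, a, x), F(x) - F(a))
```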

IV. Motivated by the fundamental theorem of calculus, here are some 
techniques for finding anti-derivatives of functions. 

A. We use the following notation for anti-derivatives: ∫ f(x) dx, which
represents the set of all functions that have derivative f(x).

B. For example, ∫ 2x dx is simply asking for all functions that have a
derivative of 2x. We know that all functions with a derivative of 2x are
of the form x^2 + c. Thus, ∫ 2x dx = x^2 + c.

V. Let’s look at some other rules for calculating integrals. 

A. The power rule for derivatives has a reverse power rule for finding
integrals: ∫ x^n dx = x^(n+1)/(n+1) + c (for n ≠ -1). For example, the
reverse power rule says that ∫ x^3 dx = x^4/4 + c. Multiplying through by
constants (real numbers) is as easy as it was for derivatives. Since
∫ x^3 dx = x^4/4 + c, then ∫ 7x^3 dx = 7x^4/4 + c.

B. Recall that the derivative of the sum was the sum of the derivatives.
The same sort of rule works for anti-derivatives. That is, the integral of
the sum is the sum of the integrals. For instance, we know ∫ 7x^3 dx
and ∫ 2x dx; therefore, we can calculate ∫ (7x^3 + 2x) dx just by adding our
previous answers. That would be 7x^4/4 + x^2 + c.

C. Unfortunately, as we saw with derivatives, the integral of the product is 
not the product of the integrals. There are some techniques of 
integration, however, that can help us do these kinds of problems. 

1. The equation for a typical bell curve is: f(x) = e^(-x^2).

2. The bell curve is used to describe numerical quantities, such as
exam scores or heights and weights. If we want to find the average
value of something that came from a bell-shaped region, then we
need to calculate an integral, such as ∫ x e^(-x^2) dx.

3. We’ll calculate this integral using the method of integration by
guessing. As a guess, we might say it equals e^(-x^2). By the chain
rule, when we differentiate that, we get (-2x) e^(-x^2). If it weren’t for
the -2, we’d have the answer exactly. If we divide through by -2
in our original guess, however, we get: ∫ x e^(-x^2) dx = -(1/2) e^(-x^2) + c.


D. What if we wanted to find the area between two different points on a
bell curve, such as the area between -1 and 2 under the bell curve
e^(-x^2)? The fundamental theorem of calculus tells us to find an
anti-derivative. Unfortunately, this function, e^(-x^2), has no simple
anti-derivative. We have to resort to the naive idea of calculating the area by
summing up a number of rectangles, at least theoretically.

1. The notation ∫_a^b f(x) dx comes from summing a group of rectangles.

2. Imagine breaking up a region from a to b into a bunch of little
rectangles. We draw a rectangle that starts at the bottom at the
point (x, 0) and goes to the top of the curve to the point (x, f(x))
with a height of f(x); its base is Δx. The area of that rectangle is
f(x) Δx.

3. If we continue to draw rectangles so that we completely cover that
spectrum as x goes from a to b, then we’re literally summing
values of the form: Σ_(x=a)^b f(x) Δx. As the widths of those rectangles
get smaller and smaller, we get ∫_a^b f(x) dx; thus, when those Δx’s go
to 0, the Δx becomes a dx.


E. Let’s put this into practice by calculating the area of a circle. We can do
this simply by adding up the areas of all the little rings inside.

1. The large circle has a radius of R. We extract one ringlet of that
circle, which has a radius of r and a circumference of 2πr. We can
flatten that ringlet out and look at the area of the edge, whose
length is 2πr and whose thickness is Δr. The total area will be the
sum of 2πr(Δr) as the radius goes from 0 to R. As Δr gets smaller,
that sum becomes the integral ∫_0^R 2πr dr.

2. We know the anti-derivative of 2πr is F(r) = πr^2. Hence the area of
a circle is F(R) - F(0) = πR^2 - π0^2 = πR^2, exactly as expected.
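
The rectangle sums in this section are easy to try out. The sketch below is an
illustration under stated assumptions, not the lecture’s own method: the helper
name riemann_sum is invented here, rectangles are sampled at their midpoints,
the bell curve is taken to be e^(-x^2), and the standard library’s math.erf is
used only as an outside cross-check for the area between -1 and 2.

```python
import math

def riemann_sum(f, a, b, n):
    """Sum of f(x) * dx over n rectangles of equal width dx on [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

bell = lambda x: math.exp(-x**2)

# Area under the bell curve between -1 and 2: estimates settle as n grows.
for n in (10, 100, 1000):
    print(n, riemann_sum(bell, -1, 2, n))

# Cross-check: this area equals (sqrt(pi)/2) * (erf(2) + erf(1)).
reference = math.sqrt(math.pi) / 2 * (math.erf(2) + math.erf(1))
print(reference)

# Integration by guessing: F(x) = -(1/2) e^(-x^2) should undo x e^(-x^2).
F = lambda x: -0.5 * math.exp(-x**2)
guess_check = riemann_sum(lambda x: x * bell(x), 0, 1, 1000)
print(guess_check, F(1) - F(0))

# Area of a circle of radius R as a sum of thin rings 2*pi*r * dr.
R = 3.0
rings = riemann_sum(lambda r: 2 * math.pi * r, 0, R, 1000)
print(rings, math.pi * R**2)
```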

VI. The use of the word integration in mathematics comes from the fact that we
can answer a big problem by breaking it up into smaller, simpler problems,
then putting the simple answers together.

A. For example, we can use integration to figure out the volume of a
sphere. One way to create a sphere is by taking a flat circle, such as a
lid, and rotating it around the x-axis. Then, we can calculate the volume
by chopping the sphere into tiny parts.

B. Chopping off one tiny part, we have a circle with a little bit of
thickness, a radius of y, and an area of πy^2. If we call the thickness Δx,
then the volume of this small piece is πy^2(Δx).

C. Because the equation of the original circle was x^2 + y^2 = R^2, we can
replace y^2 with R^2 - x^2. Thus, the sum of πy^2(Δx) can be written as the
sum of π(R^2 - x^2)Δx. We’re summing this as x goes from -R to +R.

D. In other words, as we let the widths of those slices get smaller, the
volume is equal to ∫_(-R)^R π(R^2 - x^2) dx. Finding the anti-derivative of that
is a fairly simple matter. When we do the algebra, we get (4/3)πR^3, which
is the volume of a sphere.
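
The slicing argument can be imitated directly by adding up thin discs. This is
a rough sketch rather than anything from the lecture; the function name
sphere_volume_by_slices, the radius 2, and the number of slices are assumptions
chosen for the example.

```python
import math

def sphere_volume_by_slices(R, slices=10_000):
    """Add up thin discs of radius y and thickness dx, where y^2 = R^2 - x^2."""
    dx = 2 * R / slices
    total = 0.0
    for i in range(slices):
        x = -R + (i + 0.5) * dx                  # middle of the i-th slice
        total += math.pi * (R**2 - x**2) * dx    # disc area times thickness
    return total

R = 2.0
print(sphere_volume_by_slices(R), 4 / 3 * math.pi * R**3)
```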

VII. Integrals can calculate areas and volumes, but also other physical quantities,
such as center of mass, energy, and fluid pressure. In fact, along with 
differential equations, they describe everything from heat to light to sound 
to electricity. Without a doubt, calculus is an integral part of our daily lives. 

Reading: 

Colin Adams, Joel Hass, and Abigail Thompson, How to Ace Calculus: The 
Streetwise Guide. 

Silvanus P. Thompson and Martin Gardner, Calculus Made Easy. 

Questions to Consider: 

1. Using the chain rule, find the derivative of ln x, ln 3x, and ln 7x. Explain
what you see.

2. Verify the calculation expressed by this limerick: 

The integral z^2 dz
From 1 to the cube root of 3,
Times the cosine
Of 3 pi over 9
Is the log of the cube root of e.


Lecture Twenty-One 
The Joy of Pascal’s Triangle 


Scope: We now turn from calculus to playing with numbers again. As we saw 
in our lecture on the joy of counting, if we place the binomial 
coefficients in a triangle, we discover many magical properties that we 
can explore and derive by counting arguments and the binomial 
theorem. By summing the rows, columns, and diagonals of the triangle, 
we discover powers of 2 and hockey sticks. Even the Fibonacci 
numbers make a surprise guest appearance. Ultimately, we answer the 
question: In the “Twelve Days of Christmas” song, how many gifts did 
my true love give to me? 


Outline 

I. The next three lectures are devoted to topics in probability. We’ll use some 
calculus in these lectures, as well as some discrete mathematics that 
depends on one of the most beautiful objects in mathematics, Pascal’s 
triangle. 

A. Let’s begin by looking at the first six rows of 
Pascal’s triangle, labeled 0 through 5. We 
create numbers in this triangle by adding two 
consecutive numbers in a given row to produce 
the number below. These numbers are denoted 
T(n, 0), T(n, 1), ..., T(n, n). For instance, in row
4, we have T(4, 0) = 1, T(4, 1) = 4, T(4, 2) = 6,
and so on.

B. The rule for creating the rows of Pascal’s triangle is: T(n, 0) = 1,
T(n, n) = 1, which says that the row begins and ends with a 1, and for
T(n, k), we take T(n - 1, k - 1) + T(n - 1, k). The 10 that appears in row
5 would be known as T(5, 2) and that’s equal to T(4, 1) + T(4, 2).

C. We can use this rule to create rows in the triangle. For instance, row 6
would begin with a 1; then the 6 would be obtained by adding 1 + 5.
Then, we add 5 + 10 = 15, 10 + 10 = 20, 10 + 5 = 15, 5 + 1 = 6, and we
end with a 1 again.
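
The rule for building rows translates almost word for word into a few lines of
code. The following is a minimal sketch; the function name pascal_rows is an
assumption made for the illustration, and it expects a count of at least 1.

```python
def pascal_rows(count):
    """Return rows 0 .. count-1, built with T(n,k) = T(n-1,k-1) + T(n-1,k)."""
    rows = [[1]]                       # row 0
    while len(rows) < count:
        prev = rows[-1]
        # Each interior entry is the sum of the two entries above it.
        middle = [prev[k - 1] + prev[k] for k in range(1, len(prev))]
        rows.append([1] + middle + [1])   # every row begins and ends with 1
    return rows

for n, row in enumerate(pascal_rows(7)):
    print(n, row)
```

The last line printed is row 6: 1, 6, 15, 20, 15, 6, 1.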

II. Let’s take a look at some patterns inside the triangle.

A. For instance, notice that each row is symmetric. It reads the same way left
to right as right to left. Formally, we say T(n, k) = T(n, n - k).

B. If we were to add the numbers in the triangle row by row, we see that
row 0 adds to 1, row 1 adds to 2, row 2 adds to 4, row 3 adds to 8, and
so on. Those are powers of 2; in general, row n sums to the number 2^n,
or T(n, 0) + T(n, 1) + ... + T(n, n) = 2^n.


Row 0: 1
Row 1: 1 1
Row 2: 1 2 1
Row 3: 1 3 3 1
Row 4: 1 4 6 4 1
Row 5: 1 5 10 10 5 1

C. I call the next pattern the hockey stick identity. It occurs when we add
the diagonals in the triangle. For instance, when we add 1 + 3 + 6 +
10 + 15 + 21 + 28, we get 84, which lies below and to the right of 28.
This is the hockey stick identity because of its shape: a long stick that
juts out in a new direction to give the next entry of the triangle. This
rule works whether we’re adding diagonally going to the left or the
right.
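
These three patterns (symmetry, row sums, and the hockey stick) can be
spot-checked on the first several rows. This is a small sketch, not a proof;
the helper name pascal_row and the range of rows tested are assumptions for
the example.

```python
def pascal_row(n):
    """Row n of the triangle, built by repeatedly adding neighbors."""
    row = [1]
    for _ in range(n):
        row = [1] + [row[k - 1] + row[k] for k in range(1, len(row))] + [1]
    return row

for n in range(10):
    r = pascal_row(n)
    assert r == r[::-1]       # symmetry: T(n, k) = T(n, n - k)
    assert sum(r) == 2**n     # row n sums to 2^n

# Hockey stick: add entry 2 of rows 2 through 8, compare with entry 3 of row 9.
stick = [pascal_row(n)[2] for n in range(2, 9)]
print(stick, sum(stick), pascal_row(9)[3])
# prints [1, 3, 6, 10, 15, 21, 28] 84 84
```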


III. We can understand some of these patterns through combinatorics.

A. Mathematicians typically define (n choose k) as the number of size k
subsets of the numbers 1 ... n; we defined it as the number of ways to choose k
objects from a group of n objects when order is not important. For
instance, if I have n students in my class and I need k of them to form a
committee, then the number of ways to create that committee is
(n choose k) = n!/(k!(n - k)!).

B. We saw the formula for solving this earlier. But for k < 0 or k > n, we
don’t even think of the formula; we just think of the definition, and we
get 0. In other words, how many ways could I create a committee with
-5 students? Of course, the answer is 0.


C. How does (n choose k) relate to Pascal’s triangle? I claim that
T(n, k) = (n choose k).

1. Looking at the first five rows of the triangle, we can see the terms
as (n choose k); thus, row 4 (1, 4, 6, 4, 1) is (4 choose 0), (4 choose 1),
(4 choose 2), (4 choose 3), (4 choose 4).

2. If we calculate (4 choose 2) by the formula, we get
4!/(2!(2!)) = 24/(2(2)) = 6. At least in the first five rows, it looks as if
my claim is true. Let’s prove this idea.


D. We know that the boundary numbers for (n choose k) satisfy
(n choose 0) = 1; (n choose n) = 1. Thus, the boundary conditions are as
expected.


E. The triangle condition was T(n, k) = T(n - 1, k - 1) + T(n - 1, k). Will
that growth condition, or recurrence relation, remain true as we look at
the numbers (n choose k)? Can we show that
(n choose k) = (n - 1 choose k - 1) + (n - 1 choose k)?


F. One way we can show this is true is by using algebra. That is, we add
the terms using the factorial definition; we then put those terms over a
common denominator of k!(n - k)!, add the fractions, and when the dust
settles, we get n!/(k!(n - k)!), or (n choose k).

G. We can also use a combinatorial proof. Returning to the original
question, from a class of n students, how many ways can I create a
committee of size k? On the one hand, we know the answer to that
question is (n choose k).


H. On the other hand, we can answer that question through something
known as weirdo analysis.

1. Imagine that student number n is the weirdo. Among the (n choose k)
committees, how many of them do not use the weirdo? We’re
looking at size k committees from the class of students 1 through
n - 1. By definition, that’s (n - 1 choose k).

2. How many of those committees must use student n? If student n is
on the committee, then we must choose k - 1 more students to be
on the committee from the remaining n - 1 students. Again, by
definition, we’re looking at (n - 1 choose k - 1).

3. There are (n - 1 choose k) committees without the weirdo and
(n - 1 choose k - 1) with the weirdo; their sum is the total number of
committees.

4. Hence, the number of size k committees is
(n - 1 choose k - 1) + (n - 1 choose k).

I. Comparing our two answers to the same question, we get
(n choose k) = (n - 1 choose k - 1) + (n - 1 choose k).

J. We’ve shown that the (n choose k) terms (called binomial coefficients)
have the same boundary conditions as Pascal’s triangle. They will continue to
grow in the same way as the entries of Pascal’s triangle; therefore, they
are the elements of Pascal’s triangle.
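
For a small class, weirdo analysis can be carried out by actually listing the
committees. The sketch below is an illustration with assumed values (a class of
7 and committees of size 3); the function name committee_counts is invented for
the example, and math.comb requires Python 3.8 or later.

```python
from itertools import combinations
from math import comb, factorial

def committee_counts(n, k):
    """Split all size-k committees from students 1..n by whether student n serves."""
    committees = list(combinations(range(1, n + 1), k))
    with_weirdo = sum(1 for c in committees if n in c)
    return len(committees), with_weirdo, len(committees) - with_weirdo

n, k = 7, 3
total, with_weirdo, without = committee_counts(n, k)
print(total, factorial(n) // (factorial(k) * factorial(n - k)))  # both 35
print(with_weirdo, comb(n - 1, k - 1))                           # both 15
print(without, comb(n - 1, k))                                   # both 20
```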


IV. All the patterns of Pascal’s triangle can be expressed in terms of binomial 
coefficients. 

A. For example, let’s look at the pattern we saw earlier, that the elements
of row n sum to 2^n. In terms of binomial coefficients, this says
(n choose 0) + (n choose 1) + (n choose 2) + ... + (n choose n) = 2^n.


B. We express this idea using sigma notation:
Σ_(k=0)^n (n choose k) = 2^n. Sigma is the Greek letter Σ, and the left
side is read: “the sum as k goes from zero to n of...”.

1. Here’s our combinatorial proof, beginning with the question: How many
committees can we form from a class of size n? We can break up the
question by considering the size of the committee and adding the answers:

Committees of size 0 = (n choose 0)
Committees of size 1 = (n choose 1)
Committees of size 2 = (n choose 2)
...
Total committees = Σ_(k=0)^n (n choose k)

2. Now we ask: Why is the number of committees 2^n? We can answer this
using the rule of product. To create a committee, we go through the classroom
student by student and decide whether or not each student will be
on the committee. For each student, we have two choices, on or
off, from student 1 up through student n. That’s
2 x 2 x 2 x ... x 2 (n times), or 2^n ways to create a committee.

V. Another useful theorem in mathematics is the binomial theorem, which we
can find inside Pascal’s triangle.

A. Remember this equation from basic algebra: (x + y)^2 = x^2 + 2xy + y^2. Its
coefficients appear in row 2 of Pascal’s triangle (1, 2, 1). We see that
(x + y)^3 = x^3 + 3x^2 y + 3xy^2 + y^3, and the coefficients in that expression
are row 3 of the triangle (1, 3, 3, 1). The expression (x + y)^4 would be
x^4 + 4x^3 y + 6x^2 y^2 + 4xy^3 + y^4, and those coefficients are row 4
(1, 4, 6, 4, 1). In general, for (x + y)^n, the coefficients are the numbers in
the nth row of Pascal’s triangle. Specifically, the coefficient of x^k y^(n-k)
is (n choose k).


B. We can think of (x + y)^n as (x + y)(x + y)(x + y)(x + y)... n times.
There’s only one way to get an x^n term, and that’s by taking x from the
first expression times x from the second expression times x from the
third expression, all the way down to x from the last expression.

C. There are n ways to create an x^(n-1) y term simply by deciding which y
we will use, then letting the rest of the terms be x’s.

D. For x^(n-2) y^2, we choose two terms to be y’s and all the rest x’s. There
are (n choose 2) ways to pick two y’s here; thus, the coefficient of
x^(n-2) y^2 is (n choose 2).

E. To summarize, the binomial theorem says:
(x + y)^n = Σ_(k=0)^n (n choose k) x^k y^(n-k).
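
One way to see the binomial theorem at work is to multiply out (x + y)^n
mechanically and read off the coefficients. The following sketch does that
with plain lists; the function name expand_power and the list representation
are assumptions made for this illustration.

```python
from math import comb

def expand_power(n):
    """Coefficients of (x + y)^n, found by multiplying by (x + y) n times.

    coeffs[j] is the coefficient of x^(n-j) * y^j.
    """
    coeffs = [1]                          # (x + y)^0 = 1
    for _ in range(n):
        times_x = coeffs + [0]            # multiply every term by x
        times_y = [0] + coeffs            # multiply every term by y
        coeffs = [a + b for a, b in zip(times_x, times_y)]
    return coeffs

print(expand_power(4))                                     # [1, 4, 6, 4, 1]
print(expand_power(6) == [comb(6, k) for k in range(7)])   # True
```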


VI. This simple formula can be applied to produce many beautiful identities.

A. For example, if we let x = 1 and y = 1, the binomial theorem tells us
that Σ_(k=0)^n (n choose k) = (1 + 1)^n = 2^n.

B. Another identity is Σ_(k=0)^n k (n choose k) = n 2^(n-1).

1. One way to prove this is to let y = 1 in the binomial theorem; thus:
(x + 1)^n = Σ_(k=0)^n (n choose k) x^k.

2. Let’s differentiate both sides of this equation with respect to x.
When we differentiate the left side, we get n(x + 1)^(n-1). When we
differentiate the right side, each summand has derivative of
(n choose k) k x^(k-1). Hence:
n(x + 1)^(n-1) = Σ_(k=0)^n (n choose k) k x^(k-1).

3. When we set x = 1, all the x’s disappear, and we’re left with
Σ_(k=0)^n k (n choose k) = n 2^(n-1).

C. We can prove this same theorem combinatorially. For example, from a
class of n students, how many ways can we create a committee of any
size with a chair?

1. If the committee has size k, there are (n choose k) ways to create the
committee. Once we’ve done that, there are k ways to choose the
chair of the committee. Thus, the number of committees of size k
with a chair is k (n choose k); the total number of committees over all
sizes is Σ_(k=0)^n k (n choose k).

2. There’s also a more direct way of answering this question. To
create a committee of any size from a class of n students, first, we
have n ways to pick a chair. Once we’ve done that, we have to
choose a subset of the remaining n - 1 students to serve on the
committee. How many possible committees can we form from the
remaining n - 1 students? As we saw earlier, that’s 2^(n-1). The
number of committees of this type, then, is n 2^(n-1).
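
Both ways of counting committees with a chair can be compared by brute force
for a small class. This is only a sketch with an assumed class size of 6; the
function name chaired_committees is invented for the example.

```python
from itertools import combinations
from math import comb

def chaired_committees(n):
    """Count (committee, chair) pairs by listing every nonempty committee."""
    count = 0
    for k in range(1, n + 1):
        for committee in combinations(range(n), k):
            count += len(committee)    # any of the k members can be the chair
    return count

n = 6
by_size = sum(k * comb(n, k) for k in range(n + 1))   # sum of k (n choose k)
print(chaired_committees(n), by_size, n * 2**(n - 1))
# prints 192 192 192
```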

VII. Let’s look at some other patterns in Pascal’s triangle. 

A. We summed the rows of the triangle earlier; let’s now sum the 
diagonals of the triangle. We write it as a right triangle to make the 
pattern easier to see. Summing the first diagonals, we get 1, 1, 2, 3, 5,
8, and so on. These sums are the Fibonacci numbers.

B. In any given row of Pascal’s triangle, how many of the numbers are 
odd? The top row has one odd number, the next row has two, the next 
row also has two, the next row has four, and so on. 

1. The number of odd numbers in each row of Pascal’s triangle is
always a power of 2. In fact, it’s 2 raised to the number of 1’s in
the binary expansion of n. Let’s look at an example of this.

2. Row 81 of Pascal’s triangle has the numbers (81 choose 0),
(81 choose 1), ..., (81 choose 81). How many of those binomial coefficients
are odd?

The number 81, written in terms of powers of 2, is
64 + 16 + 1, which in binary notation is 1010001. There are three
1’s in that binary expansion of 81, so that will be our exponent.
The number of odd numbers in row 81 of Pascal’s triangle is 2^3 = 8.

The positions of the 8 odd numbers in row 81 of Pascal’s triangle
are those numbers that can be formed using a subset (possibly
empty) of the numbers 1, 16, and 64. They are: 0, 1, 16, 64, 1 +
16 = 17, 1 + 64 = 65, 16 + 64 = 80, and 1 + 16 + 64 = 81.
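
Both patterns in this section can be confirmed with a short computation. The
sketch below is an illustration only: the function name diagonal_sum is an
assumption, and the diagonal starting in row n is taken to be (n choose 0),
(n - 1 choose 1), (n - 2 choose 2), and so on.

```python
from math import comb

def diagonal_sum(n):
    """Sum of the shallow diagonal C(n,0) + C(n-1,1) + C(n-2,2) + ..."""
    return sum(comb(n - k, k) for k in range(n // 2 + 1))

print([diagonal_sum(n) for n in range(8)])
# prints [1, 1, 2, 3, 5, 8, 13, 21]

# Positions k in row 81 where (81 choose k) is odd.
odd_positions = [k for k in range(82) if comb(81, k) % 2 == 1]
print(odd_positions)
# prints [0, 1, 16, 17, 64, 65, 80, 81]

print(bin(81), bin(81).count("1"))
# prints 0b1010001 3
```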



VIII. Let’s end on a holiday note with “The Twelve Days of Christmas.” What
is the total number of gifts received by the end of the 12 days?

A. On the kth day, you received 1 + 2 + 3 + ... + k, but we know that’s equal
to k(k + 1)/2, which is also equal to the binomial coefficient
(k + 1 choose 2). For example, on the 12th day of Christmas, you receive
1 + 2 + 3 + ... + 12 gifts. That’s equal to (12)(13)/2, or 78 gifts; it’s also
equal to (13 choose 2).

B. All the numbers of gifts you receive (1, 3, 6, 10, ...) lie on Pascal’s
triangle. In fact, when we summed those numbers earlier, we got the hockey
stick identity. In general, if we sum the numbers
(2 choose 2) + (3 choose 2) + (4 choose 2) + ... + (13 choose 2), the hockey
stick identity tells us that we get (14 choose 3) gifts altogether.

C. Calculating, that’s (14)(13)(12)/3!, or 364 gifts. By the end of the song,
you’ve received one gift for every day of the year, except Christmas.
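
The gift count is quick to verify by simply adding everything up. This short
sketch is an illustration; the variable name gifts_per_day is an assumption.

```python
from math import comb

# Day k brings 1 + 2 + ... + k gifts.
gifts_per_day = [sum(range(1, day + 1)) for day in range(1, 13)]

print(gifts_per_day[-1], comb(13, 2))     # prints 78 78
print(sum(gifts_per_day), comb(14, 3))    # prints 364 364
```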


Reading:

Arthur T. Benjamin and Jennifer J. Quinn, Proofs That Really Count: The Art of 
Combinatorial Proof 

Benedict Gross and Joe Harris, The Magic of Numbers, chapter 6. 

Questions to Consider: 

1. There are three odd numbers in the first two rows of Pascal’s triangle. How
many odd numbers are in the first 4 rows, the first 8 rows, and the first 16
rows? Find a pattern. Can you prove it? Also, describe the resulting picture
of Pascal’s triangle if you remove all the even numbers from it (or simply
replace each even number with 0 and replace each odd number with 1).

2. Choose any number inside Pascal’s triangle and note the six numbers that
surround it. For example, if you choose the number 15 in row 6, then the six
surrounding numbers are 5 and 10 (above it), 6 and 20 (beside it), and 21
and 35 (below it). Now draw two triangles around that number so that each
triangle contains three of those numbers. For example, the first triangle
would contain 5, 20, and 21, while the second triangle would contain 10,
6, and 35. Show that the product of both sets of numbers will always be the
same. For instance, 5 x 20 x 21 = 2,100 and 10 x 6 x 35 = 2,100. This is
sometimes called the Star of David theorem, because the two triangles form
a star with the original number in the middle.


©2007 The Teaching Company. 




Lecture Twenty-Two 
The Joy of Probability 


Scope: In this lecture, we learn to calculate answers to such questions as: What 
are the chances that when flipping a fair coin three times, I will get 
exactly three heads? Although the outcome of any single random event 
may be hard to predict, by applying the central limit theorem, we can 
forecast what will happen with a large collection of random events. We 
also look at the concepts of independence and dependence, expected 
value, and variance. This lecture incorporates many subjects we’ve 
looked at previously, including infinite series, calculus, and e. 


Outline 

I. The easiest events to understand are those that have equally likely 
outcomes. 


A. For instance, the flipping of a coin has two possible outcomes, heads or
tails. The probability of either outcome is 1/2. Probability is expressed
as a number between 0 (impossible) and 1 (certain).

B. In rolling a fair six-sided die, there are six possible outcomes, each of
which has an equal probability of occurring. The probability of rolling
any specific number is 1/6; of rolling an even number is 3/6, or 1/2; and
of rolling a number that is 5 or larger is 2/6, or 1/3.

C. There are eight sequences in which you can flip a coin three times, and
each of those sequences is equally likely. Once you’ve flipped two
heads, for example, the chance that the third flip is a head is still 1/2.

1. There are eight possible equally likely outcomes, but there is only
one way to flip three heads, so the probability of that outcome is
1/8. The probability of flipping exactly two heads is 3/8, as is the
probability of flipping exactly one head. The probability of flipping all
tails is 1/8.

2. In general, if you flip a coin n times, you have two equally likely
possibilities for the first outcome, the second outcome, the third
outcome, and so on through the nth outcome; therefore, there are 2^n
different ways of flipping the coin n times.

3. How many of those ways of flipping the coin result in exactly k
heads? Among those n coin flips, choose k of them to be heads; the
other ones will have to be tails. The number of ways of picking k
heads is (n choose k). The probability of flipping k heads is
(n choose k)/2^n.
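
For a handful of flips, we can list every sequence and count heads directly.
The sketch below is an illustration, not part of the lecture; the function name
heads_distribution is an assumption, and exact fractions are used so the
answers can be compared with (n choose k)/2^n without rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb

def heads_distribution(n):
    """Probability of each head count, found by listing all 2^n flip sequences."""
    counts = [0] * (n + 1)
    for flips in product("HT", repeat=n):
        counts[flips.count("H")] += 1      # tally sequences by number of heads
    return [Fraction(c, 2**n) for c in counts]

# Three flips: probabilities 1/8, 3/8, 3/8, 1/8 for 0, 1, 2, 3 heads.
print(heads_distribution(3))

# Five flips: the listing agrees with (5 choose k) / 2^5 for every k.
print(heads_distribution(5) == [Fraction(comb(5, k), 2**5) for k in range(6)])
```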


II. What is the probability that at least two people in a group of n people will 
have the same birth month and day? With just 23 people, there is at least a 
50% chance that two people in the room will have the same birthday. 




A. To see why that’s true, let’s answer the negative question: What’s the
probability that everyone in the room has a different birthday? In other
words, what are the equally likely events in this situation?

1. If we write down lists of birthdays for everyone in the room, how
many possible lists could we create? We’d have 365 choices for
the first birthday on the list, 365 choices for the second, and so on
through 365 choices for the last. The total number of lists that are
possible is 365^n.

2. How many ways can we create lists in which
all the birthdays are different? There would be 365 choices for the
first birthday, 364 choices for the second, 363 choices for the third,
and so on, down to 366 - n choices for the last one. The probability
that all those birthdays are different would be
365 x 364 x 363 x ... x (366 - n)/365^n. The probability that there’s at
least one match among those people is 1 - 365!/(365^n (365 - n)!).

B. If we plug in some numbers, we find that the probability of a birthday
match with 10 people is about 12%. With 20 people, the probability is greater
than 40%, and with just 23 people, the probability is 50.7%. With 100
people, the probability of a birthday match is more than 99.9999%.
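
Plugging in the numbers is straightforward in code. This is a minimal sketch
under the same assumptions as the argument above (365 equally likely birthdays,
no leap years); the function name birthday_match is invented for the example.

```python
def birthday_match(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes every day is equally likely and ignores leap years.
    """
    all_different = 1.0
    for i in range(n):
        all_different *= (days - i) / days    # 365/365 * 364/365 * 363/365 ...
    return 1 - all_different

for n in (10, 20, 23, 100):
    print(n, f"{birthday_match(n):.5%}")
```

Multiplying the fractions one at a time avoids the enormous factorials in the
closed formula.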

III. The notion of independence is important in probability problems. 

A. Two events, A and B, are independent if the occurrence of A does not 
affect the probability that B will occur. For example, the outcome of a 
coin flip has no influence on the outcome of a roll of a die. 

B. For independent events, the probability of A and B is the probability of A
times the probability of B, or P(A and B) = P(A) x P(B). For example, the
probability of flipping heads is 1/2; the probability of rolling a 3 is 1/6.
The probability that both events will occur is their product, 1/12.

C. What’s the probability of rolling five 3’s in a row? The probability of
rolling the first 3 is 1 out of 6, and the probability of rolling each of the
other 3’s is also 1 out of 6. Because each of those rolls is an
independent event, the probability of rolling five 3’s in a row is 1/6^5.

D. What’s the probability of rolling the first five digits of pi in order?
Even though this sequence seems more random than the previous one,
the probability of rolling this specific sequence is also 1/6^5.

E. The probability of rolling the numbers 1, 2, 3, 4, and 5 in that order is
1/6^5. But if we allow any order, each sequence has a probability of 1/6^5.
We can arrange the numbers 1 through 5 in 5!, or 120, ways. Thus, the
probability of rolling 1, 2, 3, 4, 5 in any order would be 5!/6^5.

I 1 ', If I roll a six-sided die ten times, what’s the probability that I will roll a 
) •Xitctly two of those times? 

1. I could roll a 3, then another 3, then 8 numbers that are not 3’s. The
probability of that sequence is 1/6 for the first 3, 1/6 for the second
3, and 5/6 for each succeeding number that is not a 3. Thus, the
probability of seeing the specific outcome of a 3, followed by a 3,
followed by 8 non-3’s would be (1/6)^2 (5/6)^8.

2. However, there are (10 choose 2) ways of choosing which two rolls
are 3’s, or (10 choose 2) sequences that each have a probability of
(1/6)^2 (5/6)^8; therefore, the answer to the original question is
(10 choose 2)(1/6)^2 (5/6)^8.

3. This is an example of a binomial probability problem, one of the
most important kinds of problems that appear in probability. In
general, when we perform an experiment, such as flipping a coin, n
times, each of those experiments has a success probability of p.
The number of successes, such as the number of heads, is x (called
a binomial random variable, meaning that it has two possibilities).
In that situation, P(x = k) = (n choose k) p^k (1 - p)^(n - k).


G. Let’s look at a geometric probability question: Suppose I roll a
six-sided die repeatedly until I see a 3. The probability that the first 3
will appear on the 10th roll is (5/6)^9 (1/6).
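
The binomial and geometric formulas above can be evaluated exactly with Python’s fractions module. This is a minimal sketch; the helper names binomial_pmf and geometric_pmf are ours, chosen for illustration.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """P(the first success happens on trial k): k - 1 failures, then a success."""
    return (1 - p) ** (k - 1) * p


p_three = Fraction(1, 6)  # chance of rolling a 3 on one roll

two_threes_in_ten = binomial_pmf(2, 10, p_three)
first_three_on_tenth = geometric_pmf(10, p_three)

print(two_threes_in_ten, float(two_threes_in_ten))
print(first_three_on_tenth, float(first_three_on_tenth))
```

Rolling exactly two 3’s in ten rolls comes out to roughly 29%, while waiting until exactly the 10th roll for the first 3 has a chance of a little over 3%.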

IV. Let’s now switch our attention to problems involving dependence. 

A. For these problems, we need to know the conditional probability
formula: The probability of A given B is the probability of A and B
divided by the probability of B, or P(A | B) = P(A and B)/P(B).


B. Let’s say I roll a six-sided die and the outcome of that roll is x. Find the
probability that x is equal to 6 given that x is greater than or equal to 4.

1. We know that the probability of getting any particular outcome is
1/6, but once we know that the outcome is greater than or equal to 4,
only three equally likely outcomes (4, 5, and 6) remain, so the
probability of a 6 is 1/3. The formula gives us that same conclusion.

2. According to the formula, the probability that x = 6 given that x is
greater than or equal to 4 is:
P(x = 6 | x ≥ 4) = P(x = 6 and x ≥ 4)/P(x ≥ 4).

3. The numerator has a redundant condition: x = 6 and x ≥ 4; thus, we can
rewrite the numerator as P(x = 6). In the denominator, the
probability that x ≥ 4 is 3/6. Therefore, the probability is
(1/6)/(3/6) = 1/3.


C. What about the probability that x is even given x ≥ 4? Using the same
idea, that’s:
P(x is even and x ≥ 4)/P(x ≥ 4) = P(x = 4 or 6)/P(x ≥ 4) = (2/6)/(3/6) = 2/3.

D. If A and B are independent events, the conditional probability formula
tells us that the probability of A happening given B happens is the
probability of A and B divided by the probability of B. But because A
and B are independent, the probability of A and B is the probability of A
times the probability of B, which we then divide by the probability of B:
P(A | B) = P(A and B)/P(B) = P(A)P(B)/P(B) = P(A),
which agrees with our notion of independence.
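
Because a die has only six equally likely outcomes, the conditional probability formula can be checked by direct counting. The following is a small sketch under that assumption; prob and cond_prob are hypothetical helper names, not notation from the lecture.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # the six equally likely faces of a fair die


def prob(event):
    """P(event), found by counting the outcomes where the event holds."""
    favorable = sum(1 for x in OUTCOMES if event(x))
    return Fraction(favorable, len(OUTCOMES))


def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B)."""
    return prob(lambda x: a(x) and b(x)) / prob(b)


def at_least_4(x):
    return x >= 4


print(cond_prob(lambda x: x == 6, at_least_4))      # 1/3
print(cond_prob(lambda x: x % 2 == 0, at_least_4))  # 2/3
```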




V. Another important concept in probability is expected value. 

A. The expected value of a random variable x, which we denote E[x], is
the weighted average value of all the possible values that x can take on.
Specifically, E[x] = Σ k P(x = k), where the sum runs over every possible
value k.

B. Let’s say x could take on three values: 0 with a probability of 1/2, 1
with a probability of 1/3, or 2 with a probability of 1/6. E[x] is a
weighted average of the numbers 0, 1, and 2, where those weights are
the probabilities. In this case, E[x] = 0(1/2) + 1(1/3) + 2(1/6) = 2/3.

C. Expected values have some properties that we might... expect. For
instance, if a is a constant, E[ax] = aE[x].

D. The expected value of x + y is the expected value of x plus the expected
value of y: E[x + y] = E[x] + E[y]. That’s true even if we add n random
variables. In other words, the expected value of the sum is the sum of
the expected values: E[x_1 + x_2 + ... + x_n] = E[x_1] + E[x_2] + ... + E[x_n].
That’s true for any random variables, independent or not.

E. Now we can apply the expected value of the sum to derive the expected
value of a binomial random variable.

1. Suppose I flip a coin n times, each with heads probability p, and x
is the number of heads that I get. What is the expected number of
heads when I perform this experiment n times?

2. Your intuition might tell you that if p is 1/2, and I flip the coin n
times, we expect about half the results to be heads. If the
probability of heads is 2/3, then we expect the number of heads to
be 2n/3. Thus, E[x] = np. We can derive this using an easy method
that looks at each individual coin flip.

3. Here’s the easy method: x_i is equal to 1 if the i-th flip is heads and 0
if it’s tails. In other words, x_i = 1 with probability p and x_i = 0 with
probability 1 - p. Then, the total number of heads will be
x_1 + x_2 + ... + x_n. In other words, we’re just counting the 1’s.

4. Thus, E[x_i] = 1(p) + 0(1 - p) = p.

5. In this way, E[x], which is E[x_1] + E[x_2] + ... + E[x_n] (the expected
value of the sum is the sum of the expected values) is equal to
p + p + p + p + ... + p, a total of n times, which is np.

VI. Variance measures the spread of x.

A. If E[x] = μ (as in mean), then the variance of x (Var(x)) is defined as
E[(x - μ)^2]. In other words, the measure of the spread is the expected
squared distance from the mean. The standard deviation of x is the
square root of that quantity.

B. Here are some handy formulas for variance and standard deviation:

1. Though we defined the variance of x in one way, in practice, it’s
often easier to calculate it as E[x^2] - E[x]^2. For example, in the
problem we saw earlier, if the probability that x = 0 is 1/2, the
probability that x = 1 is 1/3, and the probability that x = 2 is 1/6,
then E[x^2] is a weighted average of all the possible values of x^2,
which is (1/2)(0^2) + (1/3)(1^2) + (1/6)(2^2) = 1. As we saw earlier,
E[x] = 2/3; thus, Var(x) = E[x^2] - E[x]^2 = 1 - (2/3)^2 = 5/9.

2. Another property of variance that is worth knowing is as follows:
If x_1 through x_n are independent random variables, then the
variance of the sum is the sum of the variances.
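
The three-valued example used above (0, 1, or 2 with probabilities 1/2, 1/3, and 1/6) is small enough to verify exactly. In this sketch, the names dist, expectation, and variance are our own; it computes the variance with the shortcut formula and confirms that the definition gives the same answer.

```python
from fractions import Fraction

# The example distribution: value -> probability.
dist = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}


def expectation(dist, f=lambda k: k):
    """E[f(x)]: the probability-weighted average of f over all values of x."""
    return sum(f(k) * p for k, p in dist.items())


def variance(dist):
    """Var(x) computed with the shortcut E[x^2] - E[x]^2."""
    mean = expectation(dist)
    return expectation(dist, lambda k: k * k) - mean**2


mean = expectation(dist)
by_definition = expectation(dist, lambda k: (k - mean) ** 2)

print(mean)            # 2/3
print(variance(dist))  # 5/9
print(by_definition)   # 5/9
```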

VII. So far, we’ve been dealing with discrete random variables, questions that 
have nice integer answers. But many random processes have continuous 
answers. We can address continuously defined quantities using calculus. 

A. We describe the probability of continuous quantities by a probability 
density function, a curve that stays above the x-axis and whose area 
under the curve is 1. With this function, the probability that x is between 
a and b is the area under the curve between a and b. 

B. Let’s use a probability density function of x^2/9 for x between 0 and 3.
This is a legal probability density function since the integral from 0 to 3
of (x^2/9) dx equals 1. The probability that x is between 1 and 2 is the
integral from 1 to 2 of (x^2/9) dx = 7/27.

C. Continuous random variables have similar formulas to discrete random
variables. For instance, the expected value of x if x is a continuous
random variable, instead of being a weighted sum of the possible values
of x, is a weighted integral of the possible values of x. Specifically, E[x]
is the integral of x times the density function of x with respect to x.
Similarly, to find the expected value of x^2, we take the integral of x^2
times the density function of x.
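
These integrals can be approximated numerically. The sketch below uses a simple midpoint rule instead of calculus; the helper name integrate and the step count are arbitrary choices of ours, and the density x^2/9 is assumed to live on the interval from 0 to 3.

```python
def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f from a to b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def density(x):
    """The example probability density function, x^2/9 on 0 <= x <= 3."""
    return x * x / 9


total_area = integrate(density, 0, 3)                       # should be close to 1
p_between_1_and_2 = integrate(density, 1, 2)                # should be close to 7/27
mean = integrate(lambda x: x * density(x), 0, 3)            # E[x]
mean_square = integrate(lambda x: x * x * density(x), 0, 3) # E[x^2]

print(total_area, p_between_1_and_2, mean, mean_square - mean**2)
```

Doing the same integrals by hand gives E[x] = 9/4 and E[x^2] = 27/5, so the variance is 27/5 - (9/4)^2 = 27/80.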

VIII. Perhaps the most important continuous random variable of all is the 
normal distribution — the original bell-shaped curve. 


A. The most famous of these is the bell-shaped curve that has a mean of 0
and a variance of 1, but these curves can have different sizes.

B. The most general bell curve has a mean of μ and a variance of σ^2. This
has a rather imposing probability density function:
f(x) = (1/(σ√(2π))) e^(-(1/2)((x - μ)/σ)^2).

C. Every normal distribution has the following property: The probability
that a normally distributed random variable is within one standard
deviation of its mean is about 68%; within two standard deviations,
about 95%.

D. The fact that the normal distribution is so common stems from the
central limit theorem, which says that if we add up many independent
random variables, we always get approximately a normal distribution.

1. In the coin flip, where a single coin flip could be heads or tails
(probability 1/2), we can show that the expected number of heads
in a single coin flip is 1/2. The variance of a single coin flip is 1/4.

2. If we flip a coin 100 times, the expected number of heads is 50.
The variance of the number of heads is 100 x 1/4 = 25. Thus, the
standard deviation is 5.

3. Though we can’t predict the outcome of a single coin flip, we’ve
got a good handle on the outcome of 100 coin flips. That is, the
outcome has an expected value of 50 and a standard deviation of 5.

4. Since this has an approximate normal distribution, there is about a
95% chance that the number of heads will be between 40 and 60,
that is, 50 ± 2(5).

5. We’ll see how to exploit more of this kind of information in our
next lecture on mathematical games.
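
The 95% rule of thumb for 100 coin flips can be compared with the exact binomial answer. This is an illustrative sketch with variable names of our own choosing; it adds up the exact probabilities of getting between 40 and 60 heads.

```python
from math import comb, sqrt

n, p = 100, 0.5                   # 100 flips of a fair coin
mean = n * p                      # expected number of heads
std_dev = sqrt(n * p * (1 - p))   # square root of the variance, 100 x 1/4

low = int(mean - 2 * std_dev)     # two standard deviations below the mean
high = int(mean + 2 * std_dev)    # two standard deviations above the mean

# Exact chance of between `low` and `high` heads: count sequences, divide by 2^n.
exact = sum(comb(n, k) for k in range(low, high + 1)) / 2**n

print(mean, std_dev, low, high, round(exact, 4))
```

The exact figure comes out slightly above 95%, partly because the endpoints 40 and 60 are themselves included in the count.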


Reading:

Edward B. Burger and Michael Starbird, The Heart of Mathematics: An
Invitation to Effective Thinking, chapter 7.

Benedict Gross and Joe Harris, The Magic of Numbers, chapters 8-13.

Questions to Consider:

1. If 10 people are each asked to think of a card, what are the chances that at
least two of them will think of the same card?

2. In the game of Chuck-a-Luck, you bet $1 on a number between 1 and 6;
three dice are rolled. If your number appears once, you win $1; if your
number appears twice, you win $2; and if your number appears three times,
you win $3. (If your number does not appear, you lose $1.) On average,
how much should you expect to lose on each bet?

Lecture Twenty-Three
The Joy of Mathematical Games

Scope: Sometimes the probability that something happens is influenced by 

other given information. For example, the chance that a horse will win 
a race might change depending on whether it was running on a sunny 
day or a rainy day. In this lecture, we discuss conditional probability 
and the law of total probability. Applying these concepts and concepts 
from previous lectures, we analyze the chances of winning roulette and 
craps and predict the long-term losses from playing these games. 

Outline 

I. Let’s start with horseracing and Harvey, a horse who likes to run in the
rain.

A. If it rains tomorrow, Harvey has a 60% chance of winning the race,
but if it doesn’t rain tomorrow, he has a 20% chance of winning the
race. Our notation for this is: P(win | rain) = .60 and P(win | no
rain) = .20. The question is: What’s the probability that Harvey will
win the race? That answer depends on the actual probability that it
will rain.

B. The probability that Harvey will win is the weighted average of the
probability that he wins when it rains and the probability that he
wins when it doesn’t rain. If the probability of rain is 50%, the
expression is: P(win) = (.60)(.50) + (.20)(.50) = .40. If the
probability of rain is 70%, the expression is: P(win) = (.60)(.70) +
(.20)(.30) = .48.

C. Suppose that the probability of rain on race day is 99%. Harvey’s 
chances for winning now should be almost 60%. We take a 
weighted average of 60% and 20%, giving 60% a weight of .99 and 
20% a weight of .01. The weighted average of those numbers is 
.596; thus, Harvey has a 59.6% chance of winning, as our intuition 
told us. 

D. The probability that Harvey will win is governed by the law of total
probability, which states, in general: If an event B has two possible
outcomes, B_1 or B_2, then P(A) = P(A | B_1)P(B_1) + P(A | B_2)P(B_2).
Similarly, if B has n possible mutually exclusive outcomes, B_1 or B_2
or ... B_n, then P(A) = P(A | B_1)P(B_1) + ... + P(A | B_n)P(B_n).
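
As a quick illustration, the law of total probability is just a weighted sum, and Harvey’s three scenarios can be reproduced in a few lines. The function names total_probability and harvey_wins are hypothetical, introduced only for this sketch.

```python
def total_probability(conditional_probs, weights):
    """P(A) = sum of P(A | B_i) P(B_i) over mutually exclusive outcomes B_i."""
    assert abs(sum(weights) - 1) < 1e-9, "the P(B_i) must add up to 1"
    return sum(c * w for c, w in zip(conditional_probs, weights))


def harvey_wins(p_rain):
    """Harvey wins 60% of the time in the rain and 20% of the time otherwise."""
    return total_probability([0.60, 0.20], [p_rain, 1 - p_rain])


for p_rain in (0.50, 0.70, 0.99):
    print(p_rain, round(harvey_wins(p_rain), 3))
```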


II. Let’s use this formula to analyze the game of craps.

A. To play craps, you roll two dice. Let’s call the total of those two
dice the number B. If B is 7 or 11, you win immediately. If B is 2 or
3 or 12, you lose immediately. If B is 4, 5, 6, 8, 9, or 10, you keep
rolling the dice until you get a sum of B (your original total) or a
7. If a sum of B shows up first, you win, and if a 7 shows up first,
you lose.

B. According to the law of total probability, the probability of winning
(event A) is P(A) = P(A | B_1)P(B_1) + ... + P(A | B_n)P(B_n). In craps, the
B event is the total of the dice. It’s easier to determine the
probability of winning at craps overall once we know what the
number rolled is, and the law of total probability allows us to break
this problem up into more manageable pieces according to the
numbers rolled.

III. We’ll put all the information we need in a “craps table” (shown
below); then, we can figure out some of these probabilities.

A. How do we find the probability of seeing any particular number?

1. Imagine that one of the dice is green and the other one is red.
There are 6 possible outcomes for the green die and 6 for the
red die, or 6 x 6 = 36 possibilities for the green/red
combination.

2. Even though we’re only interested in the total of some number
between 2 and 12 (and those are not equally likely), we’re just
as likely to see a green 3 and a red 5 as a green 6 and a red 2.
Thus, each of the 36 outcomes has the same probability.

3. Note that there is one way to roll a total of 2. There are two
ways to roll a total of 3 (a green 2 and a red 1 or a green 1 and a
red 2). All the possible outcomes are listed in the matrix below.
To find the number of possible outcomes for each number, we
just count the number of times a given number appears out of
36.



      1    2    3    4    5    6
1     2    3    4    5    6    7
2     3    4    5    6    7    8
3     4    5    6    7    8    9
4     5    6    7    8    9   10
5     6    7    8    9   10   11
6     7    8    9   10   11   12

B. Knowing these outcomes, we can now start to fill in our craps table,
focusing first on the rows that are decided on the first roll (B = 2, 3, 7,
11, and 12):


B     P(win | B)    P(B)     Product
2     0             1/36     0
3     0             2/36     0
4     1/3           3/36     3/108 = .027777...
5     4/10          4/36     16/360 = .044444...
6     5/11          5/36     25/396 = .063131...
7     1             6/36     6/36 = .166666...
8     5/11          5/36     25/396 = .063131...
9     4/10          4/36     16/360 = .044444...
10    1/3           3/36     3/108 = .027777...
11    1             2/36     2/36 = .055555...
12    0             1/36     0


1. For instance, the probability of winning given that B = 2 is 0; if
you roll a 2, you’ve lost immediately. The probability of winning if
you roll a 3 is also 0, as is the probability of winning if you roll a
12.

2. On the other hand, the probability of winning if your first roll is a 7
is 100%, or 1, as is the probability of winning if B = 11.

C. Now we turn to some of the trickier probabilities. For instance, what’s
the probability of winning given that B = 4? There are two ways to
answer that question.

1. If the initial roll is 4, you keep rolling the dice until you see either
another 4 or a 7. If a 4 shows up before a 7, you win. If a 7 shows
up before a 4, you lose. From our matrix, we see that there are
three ways out of 36 to roll a 4 and six ways out of 36 to roll a 7.

2. The chance of winning on the next roll after you’ve rolled a 4
would be 3/36. But what are the chances that you win two rolls
after that first roll? You didn’t roll a 4 or a 7 on the next roll
(P = 27/36); then you did roll a 4 on the following roll
(P = 3/36). Multiplying those probabilities, we get (27/36)(3/36).

3. You could also win on the roll after that; that is, no 7 or 4, no 7
or 4, followed by a 4. That has the probability (27/36)^2 (3/36), and
so on.

4. This is an infinite series (a geometric series), and we know how
to sum those. We factor out 3/36 to get
(3/36)(1 + 27/36 + (27/36)^2 + ...). The sum in parentheses has the
form of a geometric series, 1 + x + x^2 + x^3 + ..., and we know that
equals 1/(1 - x). With x = 27/36, that sum is 1/(9/36) = 4, so we get
(3/36)(4) = 1/3 as the probability of rolling a 4 before rolling a 7.

D. Another way of answering that question is a bit more intuitive.

1. If we look at our matrix again, we see that there are three ways to
roll a 4 and six ways to roll a 7. Thus, there are twice as many
ways to roll a 7 as there are to roll a 4; therefore, it would make
sense that you would be twice as likely to roll a 7 before you roll
a 4.

2. The only numbers that are relevant to winning in the matrix are the
three 4’s and the six 7’s. One of those will be the first number that
you roll, and three of those possibilities allow you to win and six
cause you to lose. That’s why P(win | B = 4) = 3/9 = 1/3, which
agrees with our previous calculation.
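
Both routes to 1/3 can be confirmed with exact fractions. The sketch below (variable names are ours) adds up the first 200 terms of the series, evaluates the closed form of the geometric series, and compares them with the 3-out-of-9 shortcut.

```python
from fractions import Fraction

p_four = Fraction(3, 36)      # three ways to roll a 4
p_neither = Fraction(27, 36)  # 27 ways to roll neither a 4 nor a 7

# Win on the 1st, 2nd, 3rd, ... roll after the opening 4 (first 200 terms).
partial_sum = sum(p_four * p_neither**k for k in range(200))

# Geometric series: (3/36)(1 + x + x^2 + ...) = (3/36) / (1 - x) with x = 27/36.
closed_form = p_four / (1 - p_neither)

# The intuitive method: 3 winning outcomes among the 9 that end the game.
shortcut = Fraction(3, 3 + 6)

print(float(partial_sum), closed_form, shortcut)
```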

E. Let’s use this easier method to answer the next question: What’s the
probability that you win given that B = 5? There are four ways to roll a
5 in our matrix, and there are six ways to roll a 7. What’s the
probability that the next number you roll is a 5 before you roll a 7? Of
the ten possibilities, four of them are good and six of them are bad, so
the chance will be 4/10.

F. What about the probability that you win given that your initial roll was
a 6 or an 8? Now you’ve got a better chance of winning because there
are five ways to win and six ways to lose with each number; your
chance of winning is 5/11.

G. Using this information, we now look at our completed craps table. The
law of total probability tells us to multiply column 2 and column 3; the
product then goes in column 4. To get the total probability of winning,
we add up all those products, which gives us 244/495 = .492929..., or a
49.3% chance of winning and a 50.7% chance of losing.
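
The whole craps table can be rebuilt by counting dice outcomes. The following sketch is one way to do it in Python; the names ways and p_win_given are our own, and the point-number rule uses the counting shortcut described above (ways to roll B against ways to roll a 7).

```python
from fractions import Fraction
from itertools import product

# Count how many of the 36 green/red outcomes give each total.
ways = {}
for green, red in product(range(1, 7), repeat=2):
    ways[green + red] = ways.get(green + red, 0) + 1


def p_win_given(total):
    """P(win | B = total) under the rules of craps described above."""
    if total in (7, 11):
        return Fraction(1)   # immediate win
    if total in (2, 3, 12):
        return Fraction(0)   # immediate loss
    # Otherwise only rolls of `total` or 7 matter; `total` must come first.
    return Fraction(ways[total], ways[total] + ways[7])


# Law of total probability: add up P(win | B) x P(B) over every total B.
p_win = sum(p_win_given(b) * Fraction(ways[b], 36) for b in ways)

print(p_win, float(p_win))  # 244/495, about 0.4929
```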

H. If you know the rules of craps, you know that you can bet against the
shooter. Every time the shooter loses, you win, except if the shooter’s
initial roll is a double 6. In that case, the shooter loses, but you don’t
win or lose; the result is called a push. That event adds to your losing
probability by (1/36)(1/2) = 1/72 = .014, which makes up for the
difference in the 49.3% chance of losing and 50.7% chance of winning.

I. Putting these numbers together, the expected value when you bet $1.00
on craps is as follows: 1(.493) - 1(.507) = -.014. In other words, if you
bet $1.00, then your expected value is -1.4 cents. That doesn’t seem like
much, but if you play the game long enough, you’ll go broke.

1. The expected value is -1.4 cents. The variance of a single bet is
almost $1.00.

2. If you make 100 bets and on average you lose 1.4 cents for every
bet, then after 100 bets, you will be down about $1.40.

3. The variance of the sum is equal to the sum of the variances, so the
variance after 100 bets will be 100, but the standard deviation (the
quantity we most care about) is √100 = $10.00. Thus, your
expected loss is $1.40, but the standard deviation is $10.00. You’re
probably going to lose, but there’s a chance that you’ll still be on
the positive side after 100 bets.

4. After 10,000 bets, you’ll be down about $140. Because the standard
deviation grows with the square root of the number of bets, it will
be about $100. You now have less than a 20% chance of being in
the black after 10,000 bets.

5. After 1,000,000 bets, you will be down about $14,000 with a standard
deviation of $1,000. You will almost certainly be within two
standard deviations of your expected loss; thus, you have a 95%
chance of being down somewhere between $12,000 and $16,000;
there’s a 99% chance that you’ll be within three standard
deviations, somewhere between $11,000 and $17,000 down.

IV. A game that is easier to look at is roulette. 

A. In American roulette, we have 18 red numbers, 18 black numbers, and 2
green numbers (the 0 and the double 0). If you bet on red, you win $1.00
with a probability of 18/38; you lose $1.00 with a probability of 20/38.
Your expected value here is (18/38) - (20/38) = -2/38 = -.0526. You will
be down about 5 1/4 cents for every bet.

B. After 100 bets, you will be down about $5.26 with a standard deviation 
of $10.00. After 10,000 bets, you will be down $526 with a standard 
deviation of $100. Thus you will almost certainly be down somewhere 
between $200 and $800. 
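
The long-run figures quoted for craps and roulette follow from the mean and variance of a single $1.00 bet. Here is a rough sketch of that bookkeeping; the function name long_run is ours, and the chance of being ahead is estimated with a normal approximation (via math.erf), which is an assumption rather than an exact calculation.

```python
from math import erf, sqrt


def long_run(p_win, bets):
    """Mean, standard deviation, and approximate chance of being ahead
    after a number of even-money $1.00 bets, each won with probability p_win."""
    mean_per_bet = p_win - (1 - p_win)   # win $1 or lose $1
    var_per_bet = 1 - mean_per_bet**2    # E[x^2] = 1, so Var = 1 - E[x]^2
    mean = bets * mean_per_bet           # expected values add
    std_dev = sqrt(bets * var_per_bet)   # variances add for independent bets
    # Normal approximation to P(total winnings > 0).
    z = (0 - mean) / std_dev
    p_ahead = 0.5 * (1 - erf(z / sqrt(2)))
    return mean, std_dev, p_ahead


for game, p in (("craps", 244 / 495), ("roulette", 18 / 38)):
    for bets in (100, 10_000):
        mean, std_dev, p_ahead = long_run(p, bets)
        print(game, bets, round(mean, 2), round(std_dev, 2), round(p_ahead, 3))
```

The standard deviation grows only like the square root of the number of bets, while the expected loss grows in proportion to it, which is why the chance of being ahead shrinks as you keep playing.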

V. Let’s close with something called the Gambler’s Ruin problem. 

A. In this problem, with each bet, you win $1.00 with probability p and
you lose $1.00 with probability 1 - p, or q. You begin with d dollars
and your goal is to reach n dollars. Let’s say d = $60 and n = $100.

B. The Gambler’s Ruin theorem has a beautiful formula for figuring out
your chance of reaching n dollars without going broke:
(1 - (q/p)^d)/(1 - (q/p)^n), as long as q/p ≠ 1. When q/p = 1, which
happens when p is 1/2, then the answer is d/n. Let’s look at the
implications of this formula.

C. If you walk into a casino and play a fair game (p = 1/2), what are the 
chances that you will go from $60 to $100 before reaching $0? The 
answer is 60%. If the game is fair and you start 60% of the way toward 
your goal, then you will reach your goal with a probability of 60%. 

1. In a game such as craps, however, where your probability of
winning is 49.3%, your chance of reaching your goal is about 28%.
If your probability of winning is 49% instead of 49.3%, your
chance of reaching your goal goes to 19%.

2. If you play a game such as roulette, where your probability of
winning is 47.3% on any given play of the game, you have only a
1.3% chance of reaching your goal without going broke.

3. On the other hand, if you know a little bit of gambling theory, you
might be able to play blackjack with a 51% probability of winning,
which means you can reach your goal with a probability of 93%.

4. The lesson here is that if you’re going to gamble, you might as
well be smart about it.
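
The percentages above come straight from the Gambler’s Ruin formula with d = 60 and n = 100. A minimal sketch follows; the function name reach_goal_probability is our own.

```python
def reach_goal_probability(p, d, n):
    """Chance of going from d dollars to n dollars before hitting 0,
    betting $1.00 at a time with win probability p."""
    q = 1 - p
    if p == 0.5:
        return d / n  # fair game: q/p = 1, so the formula reduces to d/n
    ratio = q / p
    return (1 - ratio**d) / (1 - ratio**n)


for label, p in (("fair game", 0.5), ("craps", 0.493), ("49% game", 0.49),
                 ("roulette", 0.473), ("blackjack", 0.51)):
    print(label, f"{reach_goal_probability(p, 60, 100):.1%}")
```

Even a small tilt in p matters because the ratio q/p is raised to the 60th and 100th powers.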

Reading:

Martin Gardner, Martin Gardner’s Mathematical Games.

Edward Packel, The Mathematics of Games and Gambling.

Questions to Consider: 

1. When dealing cards on a table, what is the probability that an ace will
appear before a jack, queen, or king appears?

2. If you are dealt two cards at random from a deck of 52 cards, what is the
probability that one of the cards is an ace and the other card is a 10, jack,
queen, or king?

Lecture Twenty-Four
The Joy of Mathematical Magic 

Scope: In this lecture, we apply some of the skills we’ve learned to perform 
various feats of mathematical magic. We begin with the creation of 
magic squares. A magic square is a box of numbers, designed so that 
every row, column, and diagonal sums to the same total. We will learn 
techniques for creating magic squares of various sizes, including a 
personalized method for creating a magic square based on someone’s 
birthday. We will also learn how to compute cube roots in your head, as 
well as how to do magic with numbers and cards to impress and amaze. 

Outline 

I. We begin with a trick that involves phone numbers and seems to be 
intriguing to many people. You may need a calculator to follow along. 

A. Let’s call the first three digits of your phone number x and the last four 
digits y. Here are the steps to follow: 

1. Multiply the first three digits by 80: 80x.

2. Add 1: 80x + 1.

3. Multiply by 250: (80x + 1)250.

4. Add the last four digits of your phone number: (80x + 1)250 + y.

5. Add the last four digits again: (80x + 1)250 + y + y.

6. Subtract 250: (80x + 1)250 + y + y - 250.

7. Simplify and divide by 2: (20,000x + 2y)/2.

8. Answer: 10,000x + y = your phone number.

B. When we get to the number 10,000x + y, we’re just attaching four 0’s 
to x, then adding the number y, which leaves us with the phone number. 
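
The steps of the phone number trick translate directly into code. This sketch (the function name phone_trick and the sample number 555-0123 are ours) carries out each step in order so you can watch the algebra work on any seven-digit number.

```python
def phone_trick(x, y):
    """Follow the steps with x = first three digits, y = last four digits."""
    value = x * 80       # multiply the first three digits by 80
    value += 1           # add 1
    value *= 250         # multiply by 250
    value += y           # add the last four digits
    value += y           # add the last four digits again
    value -= 250         # subtract 250
    return value // 2    # divide by 2 (the value is always even here)


print(phone_trick(555, 123))  # 5550123, that is, 555-0123
```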

II. Let’s now turn to magic squares. 

A. We’ll create a magic square using my daughter’s birthday, December 3, 
1998. In the first row, we write: 12, 3, 9, 8. Adding those digits, we get 
32. Now, we have to fill out the rest of the square in such a way that 
every row and every column adds to 32. The result is on the left below. 


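12    3    9    8            A     B     C     D
 8    9   11    4            C-    D+    A-    B+
 9   10    2   11            D+    C+    B-    A-
 3   10   10    9            B     A--   D++   C

(In the square on the right, + means add 1, - means subtract 1, ++ means
add 2, and -- means subtract 2.)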


B. All the rows and columns in this square add to 32, as do the diagonals,
the square in the middle, the squares in each of the corners, and the
corners themselves. In fact, the four corners are the original numbers.

C. To create a birthday magic square of your own, suppose that the
original birth date had numbers A, B, C, and D. Begin by writing A, B,
C, and D in every row, column, and diagonal in the arrangement
shown on the right above.

D. This kind of magic square, where every row and column has the same
four numbers, is called a Latin square. To make the Latin square a bit
more magical, we start in the lower left-hand corner. We leave the
B alone, but we change the C that’s in the third row, second column, to
C + 1 (designated C+). Right now, the first diagonal will not add up
correctly, so we fix that by changing A to A-. With D, then, that group
adds up correctly. To get all the groups to balance, we fill out the rest
of the square as shown on the right above.

E. Notice that every row, column, diagonal, and group of four is balanced.
We can now go back through this process to fill in the square for the
birthday we started with.
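
The birthday square can be built and checked mechanically. The sketch below assumes the template reconstructed from the numeric example (12, 3, 9, 8), with each + or - standing for one unit; the function names birthday_square and balanced_groups are hypothetical.

```python
def birthday_square(a, b, c, d):
    """Fill in the 4 x 4 template for a birth date with numbers A, B, C, D."""
    return [
        [a,     b,     c,     d],
        [c - 1, d + 1, a - 1, b + 1],
        [d + 1, c + 1, b - 1, a - 1],
        [b,     a - 2, d + 2, c],
    ]


def balanced_groups(sq):
    """Every group of four that should add to A + B + C + D."""
    groups = [list(row) for row in sq]                            # rows
    groups += [[sq[r][c] for r in range(4)] for c in range(4)]    # columns
    groups.append([sq[i][i] for i in range(4)])                   # main diagonal
    groups.append([sq[i][3 - i] for i in range(4)])               # other diagonal
    # Four corner 2 x 2 blocks, then the 2 x 2 block in the middle.
    for r, c in [(0, 0), (0, 2), (2, 0), (2, 2), (1, 1)]:
        groups.append([sq[r][c], sq[r][c + 1], sq[r + 1][c], sq[r + 1][c + 1]])
    groups.append([sq[0][0], sq[0][3], sq[3][0], sq[3][3]])       # four corners
    return groups


square = birthday_square(12, 3, 9, 8)
for row in square:
    print(row)
print({sum(group) for group in balanced_groups(square)})  # {32}
```

Because each + is matched by a - in every group, the same check passes for any four starting numbers, although small numbers can produce zero or negative entries.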

III. Here’s a mathematical game that was inspired by a TV show: Mathematical
Survivor.

A. To keep the game simple, we start with six positive, one-digit numbers.
In fact, however, this can be done with any number of numbers, and it
will always work. Let’s use the first six digits of pi: 3, 1, 4, 1, 5, 9.

B. Choose any two of those six numbers to be removed. If we remove 3
and 5, we’re left with 1, 4, 1, 9. To replace the numbers we removed,
we multiply the two numbers, add them, then add those two results:
3(5) = 15, 3 + 5 = 8, and 15 + 8 = 23; that becomes the fifth number.
Now, we have 1, 4, 1, 9, and 23.

C. We then repeat the process. Let’s say we eliminate 1 and 4. We
multiply them, add them, then add the results: 1(4) = 4, 1 + 4 = 5,
4 + 5 = 9, leaving the list as 1, 9, 23, and 9.

D. Repeating the process, we remove 9 and 23:
9(23) = 207, 9 + 23 = 32, 207 + 32 = 239.
The list is now 1, 9, 239. We then remove 1 and 239:
1(239) = 239, 1 + 239 = 240, 239 + 240 = 479.
Now we’re left with just two numbers, 9 and 479, and when we go through
the process, the result is 4,799. Surprisingly, when we start with 3, 1, 4,
1, 5, 9, no matter what order we eliminate the numbers in, we will always
end up with 4,799.

E. We started with the numbers 3, 1, 4, 1, 5, 9. To do the trick, I used
numbers that are one greater than the original numbers, in this case, 4,
2, 5, 2, 6, 10. I then multiplied these numbers together, which results in
4,800. From that answer, I subtracted 1 to get 4,799. In general, if we
start with the numbers a_1, a_2, ..., a_n, the mathematical survivor will be:
(a_1 + 1)(a_2 + 1) ... (a_n + 1) - 1.

F. How does this work? Suppose you start with the numbers a_1 through
a_n; I start with the numbers a_1 + 1 through a_n + 1. While you’re playing
your game, I play a much simpler game. That is, whenever you choose
two numbers, I also choose the corresponding numbers, but all I do is
multiply mine together. At the end of the game, my final number is simply
the product of all the original numbers that I started with.

1. Notice, however, that every time you replace the numbers a and b
with ab + (a + b), I replace (a + 1) and (b + 1) with (a + 1)(b + 1) =
ab + a + b + 1. My new number is one greater than your new
number. That means that our lists begin one number apart
everywhere, and they remain one number apart everywhere.

2. For example, if you start with 3, 1, 4, 1, 5, 9, I start with 4, 2, 5, 2,
6, 10. When you replace 3 and 5 with 23 by multiplying, adding,
and adding, I simply multiply 4 and 6 to get 24. My 24 is one
greater than your 23. Term by term, my list of five terms is one
greater than your list of five terms, and that will remain true at
each step in the problem.

3. Because I know that my list is guaranteed to be the product of my 
six numbers, 4,800, then you’re going to be left with a number 
that’s one less than mine, 4,799. 
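
You can convince yourself that the order never matters by letting a computer play Mathematical Survivor many times with random choices. In this sketch, survivor and predicted_survivor are names of our own invention.

```python
import random
from math import prod


def survivor(numbers, rng=random):
    """Play the game, removing two randomly chosen numbers at each step."""
    nums = list(numbers)
    while len(nums) > 1:
        a = nums.pop(rng.randrange(len(nums)))
        b = nums.pop(rng.randrange(len(nums)))
        nums.append(a * b + a + b)  # multiply, add, then add the two results
    return nums[0]


def predicted_survivor(numbers):
    """The shortcut: multiply the numbers that are one greater, then subtract 1."""
    return prod(n + 1 for n in numbers) - 1


digits = [3, 1, 4, 1, 5, 9]
print(predicted_survivor(digits))                # 4799
print({survivor(digits) for _ in range(1000)})   # {4799}
```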

IV. We’ve learned to do all kinds of amazing mental calculations in this course; 
let’s see how to do instant cube roots in your head. 

A. In order to do this, you first have to memorize a table of the cubes of 
the numbers 1 through 10. Here’s the table: 


1^3 = 1     3^3 = 27    5^3 = 125    7^3 = 343    9^3 = 729
2^3 = 8     4^3 = 64    6^3 = 216    8^3 = 512    10^3 = 1,000


B. Notice that each of the last digits in the cubes is different. Also note
that when you cube a number, it ends in the same number (for example,
1^3 ends in 1, 4^3 ends in 4), or it ends with 10 minus that number (for
example, 2^3 ends in 8, 8^3 ends in 2).

C. Suppose someone tells you that a two-digit number cubed is 74,088.
First, listen for the thousands. In this case, it’s 74,000. We know that 4^3
is 64 and 5^3 is 125. That means that 40^3 is 64,000 and 50^3 is 125,000.
This cube must lie between 64,000 and 125,000, or between 40^3 and
50^3. That tells us that our answer must be 40-something.

D. Because we know the answer is a perfect cube, all we have to do is 
look at the last digit of that cube, in this case, 8. Only one number when 
cubed ends in 8, namely, 2. Thus, the last digit of the original two-digit 
number had to be 2, and 42 must be the original cube root. 

E. Let’s do one more example. Suppose I cube a two-digit number and I
tell you that the answer is 681,472. Once again, listen for the
thousands; that’s 681,000. The number 681 is between 8^3 and 9^3, or
512 and 729. That means that the original number must begin with 8.
The last digit of the cube is a 2. Only one number when cubed ends in
2, namely, 8. Thus, the original number had to be 88.
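
The two-step mental method can be written out as a short routine. This is a sketch that assumes the input really is the cube of a two-digit number; the function name two_digit_cube_root is ours.

```python
def two_digit_cube_root(cube):
    """Recover n from n^3, assuming n is a two-digit whole number."""
    thousands = cube // 1000
    # Tens digit: the largest digit whose cube does not exceed the thousands.
    tens = max(t for t in range(1, 10) if t**3 <= thousands)
    # Ones digit: the only digit whose cube ends in the cube's last digit.
    ones = next(d for d in range(10) if d**3 % 10 == cube % 10)
    return 10 * tens + ones


print(two_digit_cube_root(74_088))   # 42
print(two_digit_cube_root(681_472))  # 88
```

The routine never takes an actual cube root; like the mental trick, it only compares against the memorized table of cubes.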

V. I’d like to end, finally, with a card trick. I’m not going to give you the secret
to this card trick, but I have confidence that with all the math you’ve
learned in this course, you will be able to figure it out if you watch it a few
times.

A. This trick works with the 10’s, jacks, queens, kings, and aces from the
deck (20 cards). I begin by shuffling the cards to my heart’s content,
or your heart’s content. When you tell me to stop, I will keep the cards
in that order, but I will ask you to choose whether I should turn some of
the cards face up or face down or whether I pair up some of the cards
and keep them in the same order or flip them. In this way, we
“randomize” the cards. Finally, I deal the cards out into four rows of
five, but you choose whether I deal each row out from left to right or
from right to left.

B. The cards now are in a completely random order. I consolidate the
cards by folding the rows together. You choose whether I fold the left
edge, the right edge, the top, or the bottom. Recall that when we started
this trick, I shuffled the cards to your heart’s content. I can tell that
your heart was content because if we look at the cards that are now face
up, we have here the 10, jack, queen, king, and ace of hearts.

C. I hope in this course you’ve been able to experience the joy of math,
indeed, the magic of math, as much as I have.

Reading:

Arthur Benjamin and Michael Shermer, Secrets of Mental Math: The
Mathemagician’s Guide to Lightning Calculation and Amazing Math Tricks.

Martin Gardner, Mathematics, Magic, and Mystery.

Questions to Consider:

1. Suppose I cube a two-digit number and the answer is 456,533. What was
the original two-digit number, that is, the cube root?

2. How was the final card trick of this lecture done? Here’s a hint: At the
beginning of the trick, when did the magician offer the choice of “face up or
face down” and when did the magician offer the choice of “keep or flip”?
As for why the folding procedure works, you might look at the hint given in
the second problem of Lecture Ten.